Explore Projects

Discover 12 open-source projects

Active filters (1):

Search: image-captioning

Showing 1-12 of 12 projects

salesforce/LAVIS

LAVIS is a comprehensive library for multimodal deep learning, including image captioning, visual question answering, and more.

11.2K

Archived

Jupyter Notebook

Vision-Language Transformer

PyTorch

#deep-learning #multimodal-learning #vision-language

salesforce/BLIP

PyTorch code for Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

5.7K

Archived

Jupyter Notebook

React

#vision-language #pre-training #unified-vision-language

OpenGVLab/InternGPT

InternGPT is an open-source demo platform that showcases various AI models, including DragGAN, ChatGPT, ImageBind, and multimodal chat.

3.2K

Archived

Python

LLM Frameworks

Agents & Orchestration

React

#chatgpt #draggan #imagebind

sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning

A PyTorch tutorial for building an image captioning model using the Show, Attend, and Tell technique.

2.9K

Archived

Python

Computer Vision

Tutorials & Courses

PyTorch

#image-captioning #attention-mechanism #encoder-decoder

OFA-Sys/OFA

Official repository for the OFA (Unifying Architectures, Tasks, and Modalities) AI model, supporting various vision-language tasks.

2.6K

Archived

Python

LLM Frameworks

Computer Vision

PyTorch

#pretrained-models #multimodal #vision-language

ttengwang/Caption-Anything

Caption-Anything is a versatile AI-powered tool for generating tailored image captions with diverse controls.

1.8K

Archived

Python

Computer Vision

LLM Frameworks

React

#image-captioning #controllable-generation #chatgpt

peteanderson80/bottom-up-attention

Bottom-up attention model for image captioning and visual question answering, built on Faster R-CNN and Visual Genome.

1.5K

Archived

Jupyter Notebook

Computer Vision

ML Ops

Caffe

#image-captioning #visual-question-answering #faster-rcnn

imaginary-cloud/CameraManager

Simple Swift class to provide configurations for custom camera views in iOS apps.

1.4K

Archived

Swift

Component Libraries (Swift)

iOS

Swift

#camera #qrcode-reader #video-recording

NVlabs/prismer

Prismer: a vision-language model with multi-task experts, for image captioning and vision-language applications.

1.3K

Archived

Python

React

#vision-language-model #multi-task-learning #image-captioning

jhc13/taggui

A Python-based tool for managing and captioning image datasets, with support for various AI models and frameworks.

1.3K

Stable

Python

Computer Vision

Component Libraries (React)

#image-tagging #image-captioning #llava

microsoft/Oscar

An AI-powered image captioning and image-text search platform for developers building with AI tools.

1.1K

Archived

Python

Computer Vision

Fine-tuning

Python

#image-captioning #image-text-search #vision-and-language

ruotianluo/self-critical.pytorch

Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning.

1.0K

Archived

Python

Computer Vision

LLM Frameworks

PyTorch

#image-captioning #deep-learning #computer-vision
