Explore Projects

Discover 12 open source projects

Active filters (1):
Search: image-captioning

Showing 1-12 of 12 projects

salesforce/LAVIS

LAVIS is a comprehensive library for multimodal deep learning, including image captioning, visual question answering, and more.

11.2K
Archived
Jupyter Notebook
Vision-Language Transformer
PyTorch
#deep-learning #multimodal-learning #vision-language

salesforce/BLIP

PyTorch code for Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

5.7K
Archived
Jupyter Notebook
React
#vision-language #pre-training #unified-vision-language

OpenGVLab/InternGPT

InternGPT is an open-source demo platform that showcases various AI models, including DragGAN, ChatGPT, ImageBind, and multimodal chat.

3.2K
Archived
Python
LLM Frameworks
Agents & Orchestration
React
#chatgpt #draggan #imagebind

sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning

A PyTorch tutorial for building an image captioning model using the Show, Attend, and Tell technique.

2.9K
Archived
Python
Computer Vision
Tutorials & Courses
PyTorch
#image-captioning #attention-mechanism #encoder-decoder

OFA-Sys/OFA

Official repository for the OFA (Unifying Architectures, Tasks, and Modalities) AI model, supporting various vision-language tasks.

2.6K
Archived
Python
LLM Frameworks
Computer Vision
PyTorch
#pretrained-models #multimodal #vision-language

ttengwang/Caption-Anything

Caption-Anything is a versatile AI-powered tool for generating tailored image captions with diverse controls.

1.8K
Archived
Python
Computer Vision
LLM Frameworks
React
#image-captioning #controllable-generation #chatgpt

peteanderson80/bottom-up-attention

Bottom-up attention model for image captioning and visual question answering, built on Faster R-CNN and Visual Genome.

1.5K
Archived
Jupyter Notebook
Computer Vision
ML Ops
Caffe
#image-captioning #visual-question-answering #faster-rcnn

imaginary-cloud/CameraManager

Simple Swift class to provide configurations for custom camera views in iOS apps.

1.4K
Archived
Swift
Component Libraries (Swift)
iOS
Swift
#camera #qrcode-reader #video-recording

NVlabs/prismer

Prismer: a vision-language model with multi-task experts, for image captioning and other vision-language applications.

1.3K
Archived
Python
React
#vision-language-model #multi-task-learning #image-captioning

jhc13/taggui

A Python-based tool for managing and captioning image datasets, with support for various AI models and frameworks.

1.3K
Stable
Python
Computer Vision
Component Libraries (React)
#image-tagging #image-captioning #llava

microsoft/Oscar

Oscar (Object-Semantics Aligned Pre-training) is a vision-language pre-training method for tasks such as image captioning and image-text search.

1.1K
Archived
Python
Computer Vision
Fine-tuning
Python
#image-captioning #image-text-search #vision-and-language

ruotianluo/self-critical.pytorch

Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning.

1.0K
Archived
Python
Computer Vision
LLM Frameworks
PyTorch
#image-captioning #deep-learning #computer-vision
