Explore Projects

Discover 47 open source projects

Search: caption

Showing 21-40 of 47 projects

ttengwang/Caption-Anything

Caption-Anything is a versatile AI-powered tool for generating tailored image captions with diverse controls.

1.8K

Archived

Python

Computer Vision

LLM Frameworks

React

#image-captioning #controllable-generation #chatgpt

salesforce/ALBEF

A powerful vision-language pre-training method for tasks like image-text retrieval and captioning.

1.8K

Archived

Python

Computer Vision

Representation Learning

Python

#contrastive-learning #image-text #weakly-supervised-learning

abb128/LiveCaptions

A Linux desktop application that provides live captioning, useful for accessibility and inclusive communication.

1.7K

Experimental

Animation & Motion

CLI Tools

#accessibility #inclusive #captioning

BMSVieira/moovie.js

A JavaScript library for building customizable HTML5 video players with support for captions, controls, and streaming.

1.7K

Archived

JavaScript

Component Libraries (JavaScript)

Video Players

JavaScript

#player #captions #controls

yawiii/ComfyUI-Prompt-Assistant

A comprehensive prompt assistant that enables one-click access to LLMs/VLMs for prompt translation, expansion, and image captioning.

1.6K

Active

JavaScript

Prompt Engineering

MCP Frameworks

React

#prompt #translation #expansion

jcjohnson/densecap

A dense image captioning library in Torch for developers working on computer vision AI projects.

1.6K

Archived

Jupyter Notebook

Computer Vision

Torch

#computer-vision #image-captioning #deep-learning

google/live-transcribe-speech-engine

The speech engine behind Live Transcribe, an Android app that provides real-time captioning for people who are deaf or hard of hearing.

1.5K

Archived

Java

AI Voice & Speech

#accessibility #captioning #transcription

ruotianluo/ImageCaptioning.pytorch

This PyTorch-based repository provides tools for developing image captioning models.

1.5K

Archived

Python

Computer Vision

PyTorch

#machine-learning #computer-vision #image-captioning

peteanderson80/bottom-up-attention

Bottom-up attention model for image captioning and visual question answering, built on Faster R-CNN and Visual Genome.

1.5K

Archived

Jupyter Notebook

Computer Vision

ML Ops

Caffe

#image-captioning #visual-question-answering #faster-rcnn

NVlabs/describe-anything

An implementation for detailed localized image and video captioning using large multimodal models.

1.5K

Experimental

Python

Computer Vision

LLM Frameworks

Python

#describe-anything #detailed-localized-captioning #large-multimodal-models

rmokady/CLIP_prefix_caption

A simple image captioning model that uses CLIP image encodings to generate captions.

1.4K

Archived

Jupyter Notebook

Computer Vision

LLM Frameworks

PyTorch

#image-captioning #clip #computer-vision

royshil/obs-localvocal

An OBS plugin that enables real-time speech recognition and captioning using AI models like OpenAI Whisper.

1.4K

Active

C++

AI Voice & Speech

Realtime

#live-streaming #realtime-transcription #speech-recognition

imaginary-cloud/CameraManager

Simple Swift class to provide configurations for custom camera views in iOS apps.

1.4K

Archived

Swift

Component Libraries (Swift)

iOS

Swift

#camera #qrcode-reader #video-recording

gielcobben/caption

A cross-platform desktop application for finding and downloading subtitles for video content.

1.4K

Archived

JavaScript

Desktop App

General Utilities

Electron

#subtitle #caption #video

IDEA-Research/DINO-X-API

An open-source AI-powered computer vision model for object detection, segmentation, and understanding.

1.3K

Experimental

Python

Computer Vision

API Frameworks

Python

#open-set-object-detection #open-set-object-segmentation #pose-estimation

NVlabs/prismer

Prismer: a vision-language model with multi-task experts, for image captioning and other vision-language tasks.

1.3K

Archived

Python

React

#vision-language-model #multi-task-learning #image-captioning

jhc13/taggui

A Python-based tool for managing and captioning image datasets, with support for various AI models and frameworks.

1.3K

Stable

Python

Computer Vision

Component Libraries (React)

#image-tagging #image-captioning #llava

ratwithacompiler/OBS-captions-plugin

A C++ plugin for OBS Studio that adds closed captioning functionality using Google Speech Recognition.

1.3K

Stable

C++

API Frameworks

AI Voice & Speech

#closed-captioning #speech-recognition #obs-studio

tylin/coco-caption

Evaluation code for Microsoft COCO image captions, implementing metrics such as BLEU, METEOR, ROUGE-L, and CIDEr.

1.2K

Archived

Jupyter Notebook

Computer Vision

Frontend Frameworks

Jupyter Notebook

#computer-vision #deep-learning #image-captioning

TheShadow29/awesome-grounding

A curated list of research papers on visual grounding, a key technique for multimodal AI.

1.1K

Stable

Computer Vision

Language Grounding

#computer-vision #language-grounding #multimodal-ai
