Explore Projects

Discover 4 open source projects

Active filters (1):
Search: vision-language-pretraining

Showing 1-4 of 4 projects

deepseek-ai/Janus

Janus-Series: Unified Multimodal Understanding and Generation Models for AI-powered vibe coders.

17.7K
Experimental
Python
LLM Frameworks
Python
#foundation-models #multimodal #unified-model

salesforce/LAVIS

LAVIS is a comprehensive library for multimodal deep learning, including image captioning, visual question answering, and more.

11.2K
Archived
Jupyter Notebook
Vision-Language Transformer
PyTorch
#deep-learning #multimodal-learning #vision-language

DAMO-NLP-SG/Video-LLaMA

An open-source, instruction-tuned audio-visual language model for video understanding.

3.1K
Archived
Python
React
#video-language-pretraining #vision-language-pretraining #cross-modal-pretraining

mbzuai-oryx/Video-ChatGPT

A video conversation model that combines LLM capabilities with pretrained visual encoders for video-based chatbots.

1.5K
Experimental
Python
LLM Frameworks
Computer Vision
PyTorch
#chatbot #video-conversation #vision-language
