Explore Projects

Discover 4 open-source projects

Active filters (1):
Search: large-multimodal-models

Showing 1-4 of 4 projects

VITA-MLLM/VITA

A powerful multimodal AI model for real-time vision and speech interaction, built for developers who work with AI tools.

2.5K
Experimental
Python
LLM Frameworks
Agents & Orchestration
Python
#large-language-model #multimodal #video-understanding

OpenAdaptAI/OpenAdapt

An open-source framework for AI-powered process automation with support for large language, action, and multimodal models.

1.5K
Active
Python
LLM Frameworks
Agents & Orchestration
Python
#ai-agents #process-automation #large-language-models

NVlabs/describe-anything

An implementation for detailed localized image and video captioning using large multimodal models.

1.5K
Experimental
Python
Computer Vision
LLM Frameworks
Python
#describe-anything #detailed-localized-captioning #large-multimodal-models

ShareGPT4Omni/ShareGPT4Video

An official implementation of a system for improving video understanding and generation with better captions.

1.1K
Archived
Python
LLM Frameworks
Computer Vision
PyTorch
#chatgpt #gpt-4 #computer-vision
