Explore Projects

Discover 178 open source projects

Active filters (1):
Search: multimodal

Showing 1-20 of 178 projects

huggingface/transformers

Model framework for state-of-the-art ML models in text, vision, audio, and multimodal tasks.

157.4K
Active
Python
LLM Frameworks
Agents & Orchestration
PyTorch
#transformers#huggingface#deep-learning

Mintplex-Labs/anything-llm

All-in-one AI app for local and remote LLM usage with RAG, agents, and MCP compatibility

55.6K
Active
JavaScript
MCP Servers
Agent Coordination
JavaScript
#ai-agents#local-llm#rag

bytedance/UI-TARS-desktop

Multimodal AI agent stack for GUI and browser automation

28.6K
Stable
TypeScript
MCP Servers
Agents & Orchestration
TypeScript
#agent-tars#multimodal-ai#gui-agent

haotian-liu/LLaVA

LLaVA is a visual instruction tuning framework for large language and vision models, built toward GPT-4-level capabilities.

24.5K
Archived
Python
Computer Vision
LLM Frameworks
PyTorch
#llava#gpt-4#instruction-tuning

sgl-project/sglang

High-performance serving framework for large language and multimodal models

24.1K
Active
Python
Inference
LLM Frameworks
Python
#llm#inference#serving

OpenBMB/MiniCPM-o

On-device multimodal LLM for vision, speech, and live streaming on phones

24.0K
Active
Python
Inference
Local Inference Engines
llama.cpp-omni
#minicpm-o#multimodal-llm#on-device-ai

microsoft/unilm

Microsoft's research repo for large-scale self-supervised pre-training across tasks, languages, and modalities

22.0K
Active
Python
LLM Frameworks
#foundation-models#multimodal-ai#nlp

jina-ai/serve

Build and deploy AI services with cloud-native stack

21.8K
Experimental
Python
AI Model Serving
Containerization
Python
#cloud-native#ai-serving#docker

QwenLM/Qwen3-VL

Multimodal large language model series for developers

18.5K
Active
Jupyter Notebook
React
#LLM#Multimodal#Large Language Model

deepseek-ai/Janus

Janus-Series: Unified Multimodal Understanding and Generation Models for AI-powered vibe coders.

17.7K
Experimental
Python
LLM Frameworks
Python
#foundation-models#multimodal#unified-model

BradyFU/Awesome-Multimodal-Large-Language-Models

A collection of multimodal large language models and their latest advances.

17.4K
Active
React
#large-language-models#multimodal-chain-of-thought#in-context-learning

screenpipe/screenpipe

An open-source tool for recording screens and microphones, designed for developers building with AI tools.

17.1K
Active
TypeScript
Agents & Orchestration
TypeScript
#screen-recording#microphone-recording#ai-tools

NVIDIA-NeMo/NeMo

A scalable generative AI framework for researchers and developers

16.9K
Active
Python
React
#generative-ai#machine-learning#neural-networks

alibaba/MNN

A fast, lightweight deep learning framework used in Alibaba's business-critical use cases, supporting LLM and 3D avatar apps.

14.4K
Active
C++
LLM Frameworks
#deep-learning#machine-learning#llm

modelscope/ms-swift

A Python library for using and fine-tuning over 900 large language models and multimodal models for various AI tasks.

12.9K
Active
Python
LLM Frameworks
Python
#llm#multimodal#fine-tuning

duixcom/Duix-Avatar

An open-source AI avatar toolkit for offline video generation and digital human cloning.

12.4K
Stable
C
AI Image & Video
#ai-avatar#ai-avatars#cloning

salesforce/LAVIS

LAVIS is a comprehensive library for multimodal deep learning, including image captioning, visual question answering, and more.

11.2K
Archived
Jupyter Notebook
Vision-Language Transformer
PyTorch
#deep-learning#multimodal-learning#vision-language

pipecat-ai/pipecat

An open-source framework for building voice and multimodal conversational AI applications.

10.6K
Active
Python
LLM Frameworks
Python
#conversational-ai#voice-assistant#multimodal

rerun-io/rerun

An open source SDK for logging, storing, querying, and visualizing multimodal and multi-rate data.

10.3K
Active
Rust
Computer Vision
Rust
#multimodal#visualization#robotics

RunanywhereAI/runanywhere-sdks

Production-ready AI toolkit for local AI inference

10.2K
Active
Kotlin
AI Coding Tools
#agent-framework#android#apple-intelligence