Framework providing state-of-the-art ML models for text, vision, audio, and multimodal tasks.
All-in-one AI app for local and remote LLM usage with RAG, agents, and MCP compatibility
Multimodal AI agent stack for GUI and browser automation
LLaVA is a visual instruction tuning framework for large language and vision models, enabling GPT-4 level capabilities.
High-performance serving framework for large language and multimodal models
On-device multimodal LLM for vision, speech, and live streaming on phones
Microsoft's research repo for large-scale self-supervised pre-training across tasks, languages, and modalities
Build and deploy AI services with a cloud-native stack
Multimodal large language model series for developers
Janus-Series: unified multimodal understanding and generation models.
A collection of multimodal large language models and their latest advances.
An open-source tool for recording screens and microphones, designed for developers building with AI tools.
A scalable generative AI framework for researchers and developers
A fast, lightweight deep learning framework used in Alibaba's business-critical use cases, supporting LLM and 3D avatar apps.
A Python library for using and fine-tuning over 900 large language models and multimodal models for various AI tasks.
An open-source AI avatar toolkit for offline video generation and digital human cloning.
LAVIS is a comprehensive library for multimodal deep learning, including image captioning, visual question answering, and more.
An open-source framework for building voice and multimodal conversational AI applications.
An open-source SDK for logging, storing, querying, and visualizing multimodal and multi-rate data.
Production-ready toolkit for local AI inference