Explore Projects

Discover 53 open source projects

Active filters (1):

Search: multi-modality

Showing 1-20 of 53 projects

haotian-liu/LLaVA

LLaVA is a visual instruction tuning framework for large language and vision models, built towards GPT-4-level capabilities.

24.5K

Archived

Python

Computer Vision

LLM Frameworks

PyTorch

#llava #gpt-4 #instruction-tuning

OpenBMB/MiniCPM-o

On-device multimodal LLM for vision, speech, and live streaming on phones

24.0K

Active

Python

Inference

Local Inference Engines

llama.cpp-omni

#minicpm-o #multimodal-llm #on-device-ai

lss233/kirara-ai

A customizable, multi-modal AI chatbot that can be integrated with various chat platforms and leverages LLMs like ChatGPT, Bard, and GPT-3.

18.4K

Experimental

Python

LLM Frameworks

React

#chatbot #llm #ai-assistant

agentscope-ai/agentscope

AgentScope is a Python library for building applications using agent-oriented programming and large language models.

17.6K

Active

Python

LLM Frameworks

React

#agent #chatbot #llm

BradyFU/Awesome-Multimodal-Large-Language-Models

A collection of multimodal large language models and their latest advances.

17.4K

Active

React

#large-language-models #multimodal-chain-of-thought #in-context-learning

HKUDS/RAG-Anything

An all-in-one Retrieval-Augmented Generation (RAG) framework for building multi-modal AI applications.

14.0K

Active

Python

RAG & Vector

Python

#multi-modal-ai #retrieval-augmented-generation #rag-framework

mlfoundations/open_clip

Open source implementation of CLIP, a contrastive learning model for multi-modal tasks like zero-shot classification.

13.5K

Stable

Python

Computer Vision

PyTorch

#computer-vision #contrastive-learning #pretrained-model

jina-ai/clip-as-service

Scalable embedding, reasoning, ranking for images and sentences with CLIP

12.8K

Archived

Python

React

#authentication #streaming #real-time

TEN-framework/ten-framework

Open-source framework for building conversational voice AI agents

10.2K

Active

Python

React

#authentication #real-time #type-safe

OpenGVLab/InternVL

An open-source, large language model-based multimodal dialogue system that achieves near-GPT-4o performance.

9.9K

Stable

Python

LLM Frameworks

Python

#gpt #llm #multimodal

activeloopai/deeplake

Versatile database for AI, supporting storage, querying, versioning, and visualization of any AI data.

9.0K

Active

C++

LLM Frameworks

Vector Databases

PyTorch

#ai #data-storage #vector-database

modelscope/modelscope

ModelScope is an open-source AI framework that brings the notion of Model-as-a-Service to life, providing a comprehensive suite of tools for building, deploying, and managing AI models.

8.8K

Active

Python

LLM Frameworks

Computer Vision

Python

#machine-learning #deep-learning #computer-vision

enricoros/big-AGI

AI suite with advanced AI/AGI functions, including personas, multi-model chats, text-to-image, voice, and more.

6.9K

Active

TypeScript

LLM Frameworks

Agents & Orchestration

TypeScript

#agi #ai-agents #ai-suite

zai-org/CogVLM

A state-of-the-art open visual language model for multimodal pretraining and applications.

6.7K

Archived

Python

LLM Frameworks

Computer Vision

Python

#cross-modality #language-model #multi-modal

datajuicer/data-juicer

A Python library for processing and analyzing data with foundation models and large language models.

6.0K

Active

Python

LLM Frameworks

ETL & Pipelines

Python

#data-processing #data-analysis #foundation-models

OFA-Sys/Chinese-CLIP

Chinese version of CLIP for cross-modal retrieval and representation generation

5.8K

Stable

Jupyter Notebook

Computer Vision

LLM Frameworks

PyTorch

#chinese #clip #computer-vision

valhalla/valhalla

Open-source routing engine for OpenStreetMap that provides advanced navigation features like directions, isochrones, and traveling salesman.

5.5K

Active

C++

API Frameworks

Databases

#routing #directions #openstreetmap

hiyouga/EasyR1

EasyR1 is an efficient and scalable multi-modality reinforcement learning training framework based on veRL.

4.7K

Active

Python

Reinforcement Learning

#reinforcement-learning #ai #deepseek

zjunlp/DeepKE

An open toolkit for knowledge graph extraction and construction in Python.

4.3K

Experimental

Python

LLM Frameworks

Knowledge Graphs

PyTorch

#knowledge-graph #information-extraction #natural-language-processing

VectorSpaceLab/OmniGen

OmniGen is a unified image generation library that supports diffusion models, multi-modal and multi-task learning.

4.3K

Stable

Jupyter Notebook

Computer Vision

Image & Video

Jupyter Notebook

#diffusion #image-generation #multi-modal
