Explore Projects

Discover 53 open source projects

Active filters (1):
Search: multi-modality

Showing 21-40 of 53 projects

ThilinaRajapakse/simpletransformers

A Python library that provides a simple interface for using popular NLP models like BERT, GPT-2, XLNet, and T5 for various tasks.

4.2K
Stable
Python
LLM Frameworks
Text Classification
Python
#transformers#natural-language-processing#text-classification

open-compass/VLMEvalKit

Open-source toolkit for evaluating large multi-modal AI models, supporting 220+ models and 80+ benchmarks.

3.9K
Active
Python
LLM Frameworks
LLM Wrappers & SDKs
PyTorch
#chatgpt#llm#multi-modal

PKU-YuanGroup/Video-LLaVA

A large-scale vision-language model for video understanding and generation.

3.5K
Archived
Python
LLM Frameworks
Computer Vision
Python
#large-vision-language-model#video-understanding#multi-modal

JIA-Lab-research/MGM

An official repository for the 'Mini-Gemini' model, a multi-modal vision-language model for generation tasks.

3.3K
Archived
Python
LLM Frameworks
Computer Vision
Python
#generation#large-language-models#vision-language-model

aurelio-labs/semantic-router

A Python library that provides superfast AI decision-making and intelligent multi-modal data processing.

3.3K
Stable
Python
LLM Frameworks
Agents & Orchestration
#ai#artificial-intelligence#computer-vision

docarray/docarray

A Python library for representing, sending, storing, and searching multimodal data in AI and ML applications.

3.1K
Active
Python
LLM Frameworks
Vector Databases
PyTorch
#cross-modal#multimodal#neural-search

InternLM/InternLM-XComposer

A comprehensive multimodal system for long-term streaming video and audio interactions using large language models.

2.9K
Experimental
Python
LLM Frameworks
Computer Vision
PyTorch
#chatgpt#gpt-4#multimodal

modelscope/3D-Speaker

A library for single- and multi-modal speaker verification, recognition, and diarization.

2.8K
Stable
Python
Computer Vision
AI Voice & Speech
Python
#speaker-verification#speaker-recognition#speaker-diarization

JIA-Lab-research/LISA

A reasoning-based segmentation method built on large language models.

2.6K
Experimental
Python
LLM Frameworks
Agents & Orchestration
Python
#large-language-model#llm#multi-modal

opentripplanner/OpenTripPlanner

An open-source multi-modal trip planner.

2.6K
Active
Java
React
#trip-planner#AI-powered#open-source

X-PLUG/mPLUG-Owl

A powerful multi-modal large language model family for building advanced AI chatbots and visual recognition models.

2.5K
Experimental
Python
LLM Frameworks
Computer Vision
PyTorch
#chatbot#gpt#multimodal

zai-org/CogVLM2

An open-source multi-modal AI model based on Llama3-8B.

2.4K
Experimental
Python
LLM Frameworks
Computer Vision
Python
#open-source#multi-modal#language-model

Yuliang-Liu/Monkey

A Python library for working with large multi-modal models, focusing on image resolution and text labeling.

1.9K
Active
Python
Computer Vision
MLOps
#computer-vision#multi-modal-models#image-resolution

black0017/MedicalZooPytorch

A PyTorch-based deep learning framework for 2D/3D medical image segmentation.

1.9K
Archived
Python
Computer Vision
API Frameworks
PyTorch
#medical-imaging#image-segmentation#deep-learning

OpenMotionLab/MotionGPT

MotionGPT is a unified motion-language generation model that can generate human motion using large language models.

1.9K
Experimental
Python
LLM Frameworks
Motion Generation
Python
#motion-generation#text-to-motion#chatgpt

Kav-K/GPTDiscord

A robust, all-in-one GPT interface for Discord with chatbot, image generation, moderation, and more.

1.8K
Archived
Python
LLM Wrappers & SDKs
Authentication
asyncio
#chatbot#dalle2#discord-bot

IntelLabs/fastRAG

An efficient retrieval-augmented generation framework for multi-modal information retrieval and question answering.

1.8K
Active
Python
LLM Frameworks
RAG Frameworks
PyTorch
#information-retrieval#question-answering#multi-modal

dingodb/dingo

A high-performance, MySQL-compatible vector database that supports structured and unstructured data for AI-driven applications.

1.7K
Active
Java
Vector Databases
API Frameworks
#vector-database#mysql-compatibility#structured-data

ByteDance-Seed/VeOmni

VeOmni is a scalable, model-centric distributed training framework for multi-modal AI models.

1.7K
Active
Python
MCP Frameworks
MLOps
Python
#distributed-training#multi-modal-ai#model-centric

lyuchenyang/Macaw-LLM

Macaw-LLM is a multi-modal language modeling framework that integrates image, video, audio, and text data.

1.6K
Archived
Python
LLM Frameworks
Computer Vision
PyTorch
#multi-modal-learning#deep-learning#natural-language-processing
