LLaVA is a visual instruction tuning framework for large language-and-vision models, aiming for GPT-4-level capabilities.
On-device multimodal LLM for vision, speech, and live streaming on phones
A customizable, multi-modal AI chatbot that can be integrated with various chat platforms and leverages LLMs like ChatGPT, Bard, and GPT-3.
AgentScope is a Python library for building applications using agent-oriented programming and large language models.
A collection of multimodal large language models and their latest advances.
An all-in-one Retrieval-Augmented Generation (RAG) framework for building multi-modal AI applications.
Open source implementation of CLIP, a contrastive learning model for multi-modal tasks like zero-shot classification.
Scalable embedding, reasoning, and ranking for images and sentences with CLIP
Open-source framework for building conversational voice AI agents
An open-source, large language model-based multimodal dialogue system that achieves near-GPT-4o performance.
Versatile database for AI, supporting storage, querying, versioning, and visualization of any AI data.
ModelScope is an open-source AI framework that brings the notion of Model-as-a-Service to life, providing a comprehensive suite of tools for building, deploying, and managing AI models.
AI suite with advanced AI/AGI functions, including personas, multi-model chats, text-to-image, voice, and more.
A state-of-the-art open visual language model for multimodal pretraining and applications.
A Python library for processing and analyzing data with foundation models and large language models.
Chinese version of CLIP for cross-modal retrieval and representation generation
Open-source routing engine for OpenStreetMap that provides advanced navigation features such as directions, isochrones, and traveling-salesman route optimization.
EasyR1 is an efficient and scalable multi-modality reinforcement learning training framework based on veRL.
An open toolkit for knowledge graph extraction and construction in Python.
OmniGen is a unified image generation library that supports diffusion models and multi-modal, multi-task learning.