Run open-source AI models locally with Ollama, supporting multiple frameworks and integrations.
LLaVA is a visual instruction tuning framework for large language and vision models, enabling GPT-4-level capabilities.
A Python library for using and fine-tuning over 900 large language models and multimodal models for various AI tasks.
SUPIR is a Python library for developing practical algorithms for photo-realistic image restoration using AI.
Open-source toolkit for evaluating large multi-modal AI models, supporting 220+ models and 80+ benchmarks.
A Chinese NLP solution with large models, data, training, and inference capabilities for developers.
A large-scale vision-language model for video understanding and generation.
An AI-powered file management tool that organizes local files while ensuring privacy.
A video conversation model that combines LLM capabilities with pretrained visual encoders for video-based chatbots.
Official codebase for OMG-LLaVA and OMG-Seg, state-of-the-art computer vision models presented at CVPR 2024 and NeurIPS 2024.
A Python-based tool for managing and captioning image datasets, with support for various AI models and frameworks.
Multimodal AI toolkit for fast content understanding and generation across text, images, and video.
A curated list of notable vision-language models and their architectures for developers working with AI tools.