Explore Projects

Discover 31 open-source projects

Active filters (1):
Search: vlm

Showing 1-20 of 31 projects

huggingface/transformers

Model-definition framework for state-of-the-art ML models across text, vision, audio, and multimodal tasks.

157.4K
Active
Python
LLM Frameworks
Agents & Orchestration
PyTorch
#transformers#huggingface#deep-learning

bytedance/UI-TARS-desktop

Multimodal AI agent stack for GUI and browser automation

28.6K
Stable
TypeScript
MCP Servers
Agents & Orchestration
TypeScript
#agent-tars#multimodal-ai#gui-agent

sgl-project/sglang

High-performance serving framework for large language and multimodal models

24.1K
Active
Python
Inference
LLM Frameworks
Python
#llm#inference#serving

roboflow/notebooks

A collection of tutorials and notebooks on state-of-the-art computer vision models and techniques for developers.

9.2K
Active
Jupyter Notebook
Computer Vision
Tutorials & Courses
PyTorch
#computer-vision#deep-learning#machine-learning

yzhao062/anomaly-detection-resources

A comprehensive collection of resources for anomaly detection, including books, papers, videos, and toolboxes.

9.2K
Stable
Python
LLM Frameworks
Computer Vision
Python
#anomaly-detection#outlier-detection#fraud-detection

oumi-ai/oumi

Easily fine-tune, evaluate and deploy open-source large language models like GPT-OSS and Llama.

8.9K
Active
Python
LLM Frameworks
Inference
Python
#llms#fine-tuning#evaluation

bytedance/Dolphin

Dolphin is a document image parsing library that uses heterogeneous anchor prompting for OCR and layout analysis.

8.9K
Stable
Python
Computer Vision
API Frameworks
Python
#document-analysis#layout-analysis#ocr

NexaAI/nexa-sdk

A comprehensive SDK for running frontier LLMs and VLMs on multiple hardware platforms with day-0 model support.

7.8K
Active
Go
LLM Frameworks
LLM Wrappers & SDKs
#llm#vlm#on-device-ai

PaddlePaddle/ERNIE

ERNIE is an industrial-grade development toolkit based on PaddlePaddle for building AI-powered applications.

7.7K
Active
Python
LLM Frameworks
AI Code Editors
Python
#ernie#llm#transformer

om-ai-lab/VLM-R1

A Python-based library for solving visual understanding tasks using reinforced vision-language models (VLMs).

5.9K
Stable
Python
LLM Frameworks
Computer Vision
Python
#deepseek-r1#multimodal#reinforcement-learning

OpenBMB/UltraRAG

A low-code MCP framework for building complex and innovative RAG pipelines with AI tools.

5.4K
Active
Python
MCP Frameworks
RAG & Vector
Flask
#multimodal#low-code#rag

SkyworkAI/Skywork-R1V

An advanced multimodal AI model series for vision-language reasoning, developed by Skywork AI.

3.2K
Stable
Python
LLM Frameworks
Agents & Orchestration
Python
#multimodal#vision-language#reasoning

QiuYannnn/Local-File-Organizer

An AI-powered file management tool that organizes local files while ensuring privacy.

3.1K
Archived
Python
LLM Frameworks
LLM Wrappers & SDKs
Python
#file-organizer#llama3#llm

xlang-ai/OSWorld

A benchmark for multimodal AI agents to tackle open-ended tasks in real computer environments.

2.6K
Active
Python
Agents & Orchestration
Benchmark
Python
#multimodal-ai#agent-benchmarking#open-ended-tasks

modelscope/evalscope

A streamlined framework for efficient evaluation and performance benchmarking of large models like LLMs and VLMs.

2.5K
Active
Python
LLM Frameworks
Testing
Python
#evaluation#benchmarking#llm

BAAI-Agents/Cradle

A framework for building AI agents with strong reasoning abilities, self-improvement, and skill curation in a general computing environment.

2.5K
Archived
Python
Agents & Orchestration
LLM Frameworks
Python
#ai-agents#llm#general-computer-control

zai-org/GLM-V

A scalable multimodal reasoning framework for AI-powered applications with a focus on video and image understanding.

2.2K
Active
Python
LLM Frameworks
Agents & Orchestration
Python
#multimodal-reasoning#video-understanding#image-to-text

Achno/gowall

A Go-based tool to process images with features like color palette extraction, OCR, upscaling, and more.

2.0K
Active
Go
Computer Vision
Backend Frameworks
#image-processing#color-palette#ocr

InternLM/Tutorial

A tutorial on using large language models (LLMs) and vision language models (VLMs) for various AI-powered coding tasks.

1.9K
Experimental
Python
LLM Frameworks
LLM Wrappers & SDKs
Python
#llm#vlm#tutorial

CryptoAILab/Awesome-LM-SSP

A curated reading list for security, safety, and privacy of large language models (LLMs) and AI systems.

1.9K
Active
LLM Frameworks
Security Research
#adversarial-attacks#diffusion-models#jailbreak