Explore Projects

Discover 31 open source projects

Active filters (1):
Search: vllm

Showing 1-20 of 31 projects

vllm-project/vllm

High-throughput LLM inference engine for developers

72.1K
Active
Python
Inference
LLM Wrappers & SDKs
Hugging Face
#llm #inference #ai

BerriAI/litellm

Python SDK and Proxy Server for calling 100+ LLM APIs with cost tracking and guardrails

37.9K
Active
Python
LLM Wrappers & SDKs
API Clients & Testing
Python
#llm-sdk #ai-gateway #cost-tracking

badlogic/pi-mono

An AI agent toolkit with a coding agent CLI, unified LLM API, TUI & web UI libraries, and Slack bot for vibe coders.

20.2K
Active
TypeScript
AI Coding Agents
LLM Frameworks
TypeScript
#ai-coding-tools #llm-frameworks #cli-tools

meta-llama/llama-cookbook

This is a comprehensive guide for building with the LLaMA language model, covering inference, fine-tuning, and end-to-end solutions.

18.2K
Stable
Jupyter Notebook
LLM Frameworks
PyTorch
#llama #language-models #fine-tuning

GeeeekExplorer/nano-vllm

A lightweight vLLM implementation built from scratch in Python.

12.0K
Stable
Python
LLM Frameworks
PyTorch
#llm #nlp #deep-learning

OpenRLHF/OpenRLHF

An open-source, scalable, high-performance RLHF framework for large language models, built on Ray.

9.1K
Active
Python
LLM Frameworks
Agents & Orchestration
Ray
#reinforcement-learning #large-language-models #proximal-policy-optimization

xorbitsai/inference

Unified, production-ready inference API to run open-source, speech, and multimodal models on cloud, on-prem, or your laptop.

9.1K
Active
Python
LLM Frameworks
Inference
PyTorch
#artificial-intelligence #llm #inference

intel/ipex-llm

An accelerator for local LLM inference and fine-tuning on Intel XPUs, with seamless integration into popular LLM frameworks.

8.7K
Active
Python
LLM Frameworks
LLM Wrappers & SDKs
PyTorch
#llm #inference #fine-tuning

LMCache/LMCache

Supercharge your large language models (LLMs) with the fastest key-value cache layer for lightning-fast inference.

7.5K
Active
Python
LLM Wrappers & SDKs
Caching
PyTorch
#llm #inference #cache

OpenBMB/UltraRAG

A low-code MCP framework for building complex and innovative RAG pipelines with AI tools.

5.4K
Active
Python
MCP Frameworks
RAG & Vector
Flask
#multimodal #low-code #rag

katanaml/sparrow

Structured data extraction and instruction calling with ML, LLM, and Vision LLM.

5.1K
Active
Python
LLM Frameworks
Computer Vision
Python
#computer-vision #gpt #huggingface-transformers

xlite-dev/Awesome-LLM-Inference

A curated list of awesome papers and code for optimizing LLM/VLM inference performance

5.0K
Active
Python
LLM Frameworks
LLM Wrappers & SDKs
#llm #inference #optimization

kvcache-ai/Mooncake

Mooncake is a serving platform for Kimi, a leading LLM service provided by Moonshot AI, focused on disaggregation, inference, and RDMA.

4.9K
Active
C++
LLM Frameworks
API Frameworks
C++
#llm #inference #rdma

vllm-project/aibrix

Infrastructure components for cost-efficient GenAI model inference and deployment

4.7K
Active
Go
AI Model Serving
Infrastructure as Code
Go
#llm-inference #genai-infrastructure #model-serving

gpustack/gpustack

Optimize AI inference performance on GPUs with this Python library for selecting and tuning inference engines.

4.6K
Active
Python
Inference
CLI Tools
Python
#ai-inference #gpu-acceleration #performance-optimization

Orchestra-Research/AI-research-SKILLs

Comprehensive open-source library of AI research and engineering skills for any AI model.

4.4K
Active
TeX
LLM Frameworks
Agents & Orchestration
#ai #machine-learning #claude

skyzh/tiny-llm

A course for systems engineers on LLM inference serving on Apple Silicon: build a tiny vLLM and serve Qwen.

3.9K
Stable
Python
LLM Frameworks
LLM Wrappers & SDKs
Python
#llm #qwen #vllm

PaddlePaddle/FastDeploy

High-performance Inference and Deployment Toolkit for LLMs and VLMs

3.7K
Active
Python
PaddlePaddle
#inference #deployment #LLMs

vllm-project/semantic-router

A semantic router system for deploying and managing a mixture of AI models at the cloud, data center, and edge.

3.3K
Active
Go
LLM Frameworks
MCP Frameworks
Golang
#ai-gateway #bert-classification #llm

vllm-project/vllm-omni

A Python framework for efficient model inference with omni-modality AI models.

2.9K
Active
Python
Inference
Multimodal
PyTorch
#audio-generation #diffusion #image-generation