High-throughput LLM inference engine for developers
Python SDK and Proxy Server for calling 100+ LLM APIs with cost tracking and guardrails
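As an illustration of the unified-API pattern such an SDK provides, here is a minimal sketch of routing one call signature to multiple provider backends. All names here (`complete`, `BACKENDS`, the stub functions) are hypothetical, not the project's actual API:

```python
# Minimal sketch of a unified LLM API layer (hypothetical names, not
# the project's real interface): one call signature, many providers.

def _openai_backend(prompt: str) -> str:
    # Stand-in for a real provider call; a real SDK would issue an
    # HTTP request here and track token cost on the response.
    return f"[openai] {prompt}"

def _anthropic_backend(prompt: str) -> str:
    return f"[anthropic] {prompt}"

# Provider registry keyed by the "provider/model" prefix.
BACKENDS = {
    "openai": _openai_backend,
    "anthropic": _anthropic_backend,
}

def complete(model: str, prompt: str) -> str:
    """Route a completion request to the backend named in `model`
    (e.g. 'openai/gpt-4o'), using a single shared signature."""
    provider, _, _name = model.partition("/")
    if provider not in BACKENDS:
        raise ValueError(f"unknown provider: {provider}")
    return BACKENDS[provider](prompt)

print(complete("openai/gpt-4o", "hello"))     # dispatched to the openai stub
print(complete("anthropic/claude", "hello"))  # dispatched to the anthropic stub
```

The point of the pattern is that adding a provider only extends the registry; callers never change.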
An AI agent toolkit with a coding agent CLI, unified LLM API, TUI & web UI libraries, and Slack bot for vibe coders.
A comprehensive guide to building with the LLaMA language model, covering inference, fine-tuning, and end-to-end solutions.
A Python library for working with large language models (LLMs) and natural language processing tasks.
An open-source, scalable, and high-performance RL framework for building AI-powered applications and tools.
Unified, production-ready inference API to run open-source, speech, and multimodal models on cloud, on-prem, or your laptop.
An accelerator for local LLM inference and fine-tuning on Intel XPUs, with seamless integration into popular LLM frameworks.
A high-performance key-value (KV) cache layer that speeds up LLM inference.
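The speedup from a KV cache layer comes from reusing previously computed attention keys and values instead of recomputing them for every token. A toy sketch of the reuse pattern (illustrative only, not this project's code; real caches store per-layer attention tensors keyed by sequence prefix, not simple token IDs):

```python
# Toy illustration of key-value caching for autoregressive decoding
# (conceptual only; real KV caches hold per-layer attention tensors).

compute_calls = 0

def expensive_kv(token_id: int) -> tuple[int, int]:
    """Stand-in for computing a token's attention key/value tensors."""
    global compute_calls
    compute_calls += 1
    return (token_id * 2, token_id * 3)  # fake "key" and "value"

cache: dict[int, tuple[int, int]] = {}

def kv_for(token_id: int) -> tuple[int, int]:
    # On a cache hit the expensive computation is skipped entirely --
    # this skip is the source of the inference speedup.
    if token_id not in cache:
        cache[token_id] = expensive_kv(token_id)
    return cache[token_id]

# Process the same prefix twice: the second pass is all cache hits.
for token in [1, 2, 3, 1, 2, 3]:
    kv_for(token)
print(compute_calls)  # 3 computations for 6 lookups
```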
A low-code MCP framework for building RAG pipelines with AI tools.
Structured data extraction and instruction calling with ML models, LLMs, and vision LLMs for AI developers.
A curated list of awesome papers and code for optimizing LLM/VLM inference performance
Mooncake is the serving platform for Kimi, an LLM service from Moonshot AI, focused on disaggregated inference and RDMA-based data transfer.
Infrastructure components for cost-efficient GenAI model inference and deployment
Optimize AI inference performance on GPUs with this Python library for selecting and tuning inference engines.
Comprehensive open-source library of AI research and engineering skills for any AI model.
A course for systems engineers on building a tiny vLLM-style engine and serving Qwen inference on Apple Silicon.
High-performance Inference and Deployment Toolkit for LLMs and VLMs
A semantic router system for deploying and managing a mixture of AI models across cloud, data center, and edge.
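The core idea of semantic routing is choosing which model serves a request based on the request's content. A minimal keyword-based sketch (hypothetical; real semantic routers typically match on embedding similarity rather than literal keywords):

```python
# Toy semantic router (illustrative only, not this project's code):
# pick a serving model for a request by inspecting its content.

ROUTES = {
    "code": "code-model",   # hypothetical model names
    "math": "math-model",
}
DEFAULT_MODEL = "general-model"

def route(query: str) -> str:
    """Return the model name that should serve this query."""
    q = query.lower()
    for keyword, model in ROUTES.items():
        if keyword in q:
            return model
    return DEFAULT_MODEL

print(route("fix this Python code"))   # code-model
print(route("solve this math proof"))  # math-model
print(route("tell me a story"))        # general-model
```

A production router would also weigh deployment location (cloud vs. edge), cost, and load, not just topic.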
A Python framework for efficient model inference with omni-modality AI models.