Explore Projects

Discover 17 open-source projects

Active filters (1):
Search: model-serving

vllm-project/vllm

High-throughput LLM inference engine for developers

72.1K
Active
Python
Inference
LLM Wrappers & SDKs
Hugging Face
#llm #inference #ai

bentoml/BentoML

BentoML is an easy-to-use framework for building and deploying production-ready machine learning models as APIs.

8.5K
Active
Python
LLM Frameworks
API Clients & Testing
Python
#ai-inference #llm-inference #llm-serving

ahkarami/Deep-Learning-in-Production

A repository sharing notes and references on deploying deep learning models in production.

4.4K
Archived
React
#deep-learning #production #model-serving

beclab/Olares

Olares is an open-source personal cloud platform to reclaim your data and enable local AI computing.

4.2K
Active
Go
MCP Servers
Agents & Orchestration
Go
#ai-privacy #edge-ai #home-automation

FedML-AI/FedML

A unified and scalable ML library for large-scale distributed training, model serving, and federated learning.

4.0K
Stable
Python
ML Ops
Inference
React
#ai #machine-learning #federated-learning

ModelTC/LightLLM

LightLLM is a high-performance, scalable Python-based framework for inference and serving of large language models.

3.9K
Active
Python
LLM Frameworks
API Frameworks
#llm #model-serving #deep-learning

HuaizhengZhang/AI-Infra-from-Zero-to-Hero

A comprehensive collection of resources and tutorials for building AI infrastructure and systems.

3.7K
Experimental
LLM Frameworks
ML Ops
#ai-infrastructure #machine-learning #llm

predibase/lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

3.7K
Experimental
Python
LLM Frameworks
BaaS Platforms
PyTorch
#llm #fine-tuning #model-serving

thu-pacman/chitu

High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

3.4K
Active
Python
LLM Frameworks
API Frameworks
PyTorch
#llm #inference #gpu

vllm-project/vllm-omni

A Python framework for efficient model inference with omni-modality AI models.

2.9K
Active
Python
Inference
Multimodal
PyTorch
#audio-generation #diffusion #image-generation

tensorchord/envd

Reproducible development environment for humans and agents

2.2K
Active
Go
Go
#docker #development-environment #agent

vllm-project/vllm-ascend

A community-maintained hardware plugin for running large language models (LLMs) on Ascend accelerators.

1.7K
Active
C++
LLM Frameworks
Inference
#ascend #llm-serving #llmops

mlrun/mlrun

MLRun is an open-source MLOps platform for building and managing continuous ML applications.

1.7K
Active
Python
MLOps
API Frameworks
Python
#machine-learning #data-engineering #workflow

kitops-ml/kitops

An open-source DevOps tool for packaging and versioning AI/ML models, datasets, code, and configuration into an OCI Artifact.

1.3K
Active
Go
MLOps
Containerization
Go
#ai #devops #oci

logicalclocks/hopsworks

Hopsworks is a feature store and MLOps platform for data-intensive AI and machine learning applications.

1.3K
Experimental
Java
ML Ops
Feature Store
#feature-store #mlops #machine-learning

basetenlabs/truss

The simplest way to serve AI/ML models in production, with support for popular models like Stable Diffusion and Whisper.

1.1K
Active
Python
Inference
API Development
Flask
#artificial-intelligence #machine-learning #model-serving

alibaba/rtp-llm

RTP-LLM is a high-performance LLM inference engine from Alibaba for diverse AI applications.

1.1K
Active
Cuda
LLM Frameworks
LLM Inference
CUDA
#gpt #llama #llm
