A high-throughput LLM inference engine for developers.
BentoML is an easy-to-use framework for building and deploying production-ready machine learning models as APIs.
A repository sharing notes and references on deploying deep learning models in production.
Olares is an open-source personal cloud platform to reclaim your data and enable local AI computing.
A unified and scalable ML library for large-scale distributed training, model serving, and federated learning.
LightLLM is a high-performance, scalable Python-based framework for inference and serving of large language models.
A comprehensive collection of resources and tutorials for building AI infrastructure and systems.
A multi-LoRA inference server that scales to thousands of fine-tuned LLMs.
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
A Python framework for efficient inference with omni-modal AI models.
A reproducible development environment for humans and agents.
A community-maintained hardware plugin for running large language models (LLMs) on Ascend accelerators.
MLRun is an open-source MLOps platform for building and managing continuous ML applications.
An open-source DevOps tool for packaging and versioning AI/ML models, datasets, code, and configuration into an OCI Artifact.
Hopsworks is a feature store and MLOps platform for data-intensive AI and machine learning applications.
The simplest way to serve AI/ML models in production, with support for popular models like Stable Diffusion and Whisper.
RTP-LLM is a high-performance LLM inference engine from Alibaba for diverse AI applications.