flashinfer-ai/flashinfer

A kernel library for high-performance LLM serving, with GPU-accelerated (CUDA) attention and sampling kernels exposed through Python APIs, plus support for distributed inference.

Python · AI & Machine Learning · LLM Frameworks · Apache-2.0

Stars: 5.1K
Forks: 760
Created: Jul 22, 2023
Last Updated: Mar 5, 2026

Project Analytics

Stars Growth (1 Month): +206 (+4.2%)
Avg Daily Growth (1 Month): +7.4 stars/day
Fork/Star Ratio (All Time): 15.0% (good engagement)
Lifetime Growth: 5.3 stars/day over 959 days
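The analytics figures above are simple derived ratios. As a sketch, they can be reproduced from the rounded counts shown on the page ("5.1K" stars, 760 forks, +206 stars in a month) and the created/updated dates; exact counts and the site's day-counting convention may differ slightly, so results are approximate.

```python
# Sketch: reproducing the page's analytics from its rounded stats.
# Counts are rounded (e.g. "5.1K" stars), so results are approximate.
from datetime import date

stars = 5100            # "5.1K" (rounded)
forks = 760
stars_gained_1mo = 206  # "+206" over the last month
created = date(2023, 7, 22)
last_updated = date(2026, 3, 5)

# Fork/star ratio (all time): forks as a percentage of stars.
fork_star_ratio = 100 * forks / stars                    # ~14.9%, shown as 15.0%

# Monthly percent change, relative to the star count a month ago.
pct_change_1mo = 100 * stars_gained_1mo / (stars - stars_gained_1mo)  # ~4.2%

# Lifetime growth rate in stars per day. This count gives 957 days;
# the page shows 959, likely a different snapshot or inclusive count.
lifetime_days = (last_updated - created).days
lifetime_rate = stars / lifetime_days                    # ~5.3 stars/day

print(f"{fork_star_ratio:.1f}% | {pct_change_1mo:.1f}% | {lifetime_rate:.1f}/day")
```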

[Charts: stars, forks, open issues, pull requests, and commits over time]

AI-Generated Tags

llm
inference
cuda
distributed
gpu
pytorch
