Explore Projects

Discover 10 open source projects

Active filters (1):
Search: triton

Showing 1-10 of 10 projects

triton-lang/triton

Triton is a development repository for a domain-specific programming language and compiler focused on machine learning and AI workloads.

18.6K
Active
MLIR
LLM Frameworks
#machine-learning #compilers #domain-specific-language

triton-inference-server/server

An optimized cloud and edge inference solution for deploying and running machine learning models.

10.4K
Active
Python
Inference
#machine-learning #deep-learning #gpu

ModelTC/LightLLM

LightLLM is a high-performance, scalable Python-based framework for inference and serving of large language models.

3.9K
Active
Python
LLM Frameworks
API Frameworks
#llm #model-serving #deep-learning

NVIDIA/GenerativeAIExamples

Collection of generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.

3.8K
Active
Jupyter Notebook
LLM Frameworks
Inference
React
#gpu-acceleration #large-language-models #microservice

thu-ml/SageAttention

Quantized attention that achieves 2-5x speedup over FlashAttention for language, image, and video models.

3.2K
Active
Cuda
Inference
Quantization
PyTorch
#attention #efficient-attention #inference-acceleration

ELS-RD/kernl

Kernl is a library that lets you run PyTorch transformer models several times faster on GPU with a single line of code.

1.6K
Active
Jupyter Notebook
LLM Frameworks
API Frameworks
PyTorch
#cuda #transformer #triton

ByteDance-Seed/Triton-distributed

Distributed compiler based on Triton for parallel systems, focused on AI and high-performance computing.

1.4K
Active
Python
LLM Frameworks
API Frameworks
#distributed-computing #parallel-processing #high-performance

TritonDataCenter/triton

Triton DataCenter is a cloud management platform with first-class support for containers.

1.4K
Experimental
Shell
Containerization
#cloud #virtualization #containers

chengzeyi/stable-fast

An inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.

1.3K
Experimental
Python
Inference
Performance Optimizations
PyTorch
#cuda #deeplearning #diffusers

TritonDataCenter/containerpilot

A Go-based service for auto-discovery and configuration of applications running in containers.

1.1K
Archived
Go
Containerization
CLI Tools
#containers #service-discovery #orchestration
