Triton is a domain-specific programming language and compiler for machine learning and AI workloads.
An optimized cloud and edge inference solution for deploying and running machine learning models.
LightLLM is a high-performance, scalable Python-based framework for inference and serving of large language models.
A collection of generative AI reference workflows optimized for accelerated infrastructure and microservice architectures.
Quantized attention that achieves a 2-5x speedup over FlashAttention for language, image, and video models.
Kernl is a library that lets you run PyTorch transformer models several times faster on GPU with a single line of code.
Distributed compiler based on Triton for parallel systems, focused on AI and high-performance computing.
Triton DataCenter is a cloud management platform with first-class support for containers.
An inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
A Go-based service for auto-discovery and configuration of applications running in containers.