Explore Projects

Discover 2 open source projects

Active filters (1):
Search: inference-acceleration


thu-ml/SageAttention

Quantized attention that achieves a 2-5x speedup over FlashAttention for language, image, and video models.

3.2K
Active
CUDA
Inference
Quantization
PyTorch
#attention #efficient-attention #inference-acceleration

ali-vilab/TeaCache

A Python library for accelerating inference of video diffusion models using timestep embedding caching.

1.3K
Experimental
Python
Inference
Computer Vision
#video-generation #diffusion-models #inference-acceleration
