Explore Projects

Discover 1 open source project

Search: efficient-attention

thu-ml/SageAttention

Quantized attention that achieves 2-5x speedup over FlashAttention for language, image, and video models.

3.2K

Active

CUDA

Inference

Quantization

PyTorch

#attention #efficient-attention #inference-acceleration
