Explore Projects

Discover 2 open source projects

Active filters (1):

Search: inference-acceleration

Showing 1-2 of 2 projects

Quantized attention that achieves 2-5x speedup over FlashAttention for language, image, and video models.

3.2K

Active

Cuda

Inference

Quantization

PyTorch

#attention #efficient-attention #inference-acceleration

A Python library for accelerating inference of video diffusion models using timestep embedding caching.

1.3K

Experimental

Python

Inference

Computer Vision

#video-generation #diffusion-models #inference-acceleration
