A high-performance Transformer library for accelerating AI models on NVIDIA GPUs, including low-precision support.
Optimizes large language models with low-bit precision and sparsity for model compression.
A low-precision matrix multiplication library for performance-critical AI and machine learning applications.