Fast C++ inference engine for Transformer models, with CUDA and MKL backends and other performance optimizations.
A C library providing optimized, performance-focused GEMM (General Matrix Multiplication) routines.
A high-performance GEMM library for deep learning inference on CPUs, developed by Facebook.
CLBlast is a tuned, open-source BLAS library for OpenCL, focused on providing efficient matrix-multiplication operations for GPU acceleration.