LeetCUDA is a comprehensive collection of modern CUDA learning resources, including 200+ CUDA kernels, Tensor Cores, HGEMM, and FA-2 MMA.
NVIDIA CUDA samples that demonstrate features of the CUDA Toolkit for GPU-accelerated development.
LMDeploy is a toolkit for compressing, deploying, and serving large language models (LLMs).
Rust-CUDA is an ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.
NVIDIA's CUDA Core Compute Libraries for accelerated computing and GPU programming in C++.
Deep learning library for Rust with shape-checked tensors and neural networks.
Kernl is a library that lets you run PyTorch transformer models several times faster on GPU with a single line of code.
Safe Rust wrapper around the CUDA toolkit for GPU acceleration in AI/ML applications.