Showing 1-20 of 133 projects
High-throughput LLM inference engine for developers
Course on building a Storyteller AI LLM from scratch in Python, C, and CUDA
Password recovery tool with GPU acceleration
High-performance serving framework for large language and multimodal models
Build and run Docker containers leveraging NVIDIA GPUs
Lightning-fast neural graphics primitives for real-time 3D reconstruction, rendering, and more.
An open-source speech recognition toolkit used for building speech recognition systems.
Burn is a high-performance tensor library and deep learning framework for AI and scientific computing in Rust.
CUDA on non-NVIDIA GPUs: a Rust library for running CUDA on a variety of GPU architectures.
Open3D is a modern C++ library for 3D data processing, including reconstruction, registration, and visualization.
TensorRT LLM provides a Python API and optimizations to efficiently run large language models on NVIDIA GPUs.
Open-source voice synthesis studio powered by Qwen3-TTS
A puzzle-based learning resource for developers to explore CUDA and machine learning.
A high-performance task-parallel programming system for C++ developers building concurrent and heterogeneous applications.
NumPy-aware dynamic Python compiler using LLVM, enabling high-performance array and numerical computing.
A GPU-accelerated NumPy & SciPy library for high-performance scientific computing
LeetCUDA is a comprehensive collection of modern CUDA learning resources, including 200+ CUDA kernels, Tensor Cores, HGEMM, and FA-2 MMA.
A high-performance GPU DataFrame library for data analysis and machine learning workloads.
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
A high-performance linear algebra library for GPU-accelerated deep learning and other applications.