Explore Projects

Discover 133 open-source projects

Active filters (1):
Search: cuda

Showing 41-60 of 133 projects

gpustack/gpustack

Optimize AI inference performance on GPUs with this Python library for selecting and tuning inference engines.

4.6K
Active
Python
Inference
CLI Tools
Python
#ai-inference #gpu-acceleration #performance-optimization

NVIDIA/nccl

An optimized library for efficient multi-GPU communication in deep learning applications.

4.5K
Active
C++
GPU
API Frameworks
CUDA
#communications #cuda #gpu

NVlabs/tiny-cuda-nn

A fast C++/CUDA neural network framework for high-performance deep learning and rendering.

4.4K
Stable
C++
Frameworks
API Frameworks
PyTorch
#cuda #deep-learning #gpu

OpenNMT/CTranslate2

Fast C++ inference engine for Transformer models, supporting CUDA, MKL, and other optimizations.

4.3K
Active
C++
Inference
API Frameworks
#deep-learning #machine-translation #neural-machine-translation

ROCm/hip

A C++ runtime API and kernel language for writing portable code that runs on both AMD and NVIDIA GPUs.

4.3K
Active
C++
API Frameworks
Build Tools
#cuda #hip #portability

iree-org/iree

A retargetable MLIR-based machine learning compiler and runtime toolkit for AI/ML developers.

3.6K
Active
C++
ML Ops
Inference
MLIR
#compiler #machine-learning #runtime

ashvardanian/StringZilla

Accelerate string operations in C, C++, Python, Rust, and more with SIMD and GPU-powered algorithms.

3.4K
Stable
C
API Frameworks
Search
C
#string-manipulation #search #sorting

Infatoshi/cuda-course

A CUDA course for developers to learn about GPU computing and parallel processing.

3.3K
Stable
Cuda
React
#cuda #gpu #parallel-processing

patriciogonzalezvivo/lygia

LYGIA is a flexible, multi-language shader library designed for performance, supporting GLSL, HLSL, Metal, WGSL, WESL, and CUDA.

3.3K
Active
GLSL
Backend Frameworks
Graphics Libraries
OpenGL
#shader #graphics #performance

Jittor/jittor

Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.

3.2K
Active
Python
LLM Frameworks
API Frameworks
#deep-learning #cuda #gpu

HazyResearch/ThunderKittens

A library of tile primitives for writing fast CUDA kernels for AI workloads.

3.2K
Active
Cuda
ML Ops
API Frameworks
#cuda #kernels #performance

thu-ml/SageAttention

Quantized attention that achieves 2-5x speedup over FlashAttention for language, image, and video models.

3.2K
Active
Cuda
Inference
Quantization
PyTorch
#attention #efficient-attention #inference-acceleration

NVIDIA/TransformerEngine

A high-performance Transformer library for accelerating AI models on NVIDIA GPUs, including low-precision support.

3.2K
Active
Python
LLM Frameworks
Inference
PyTorch
#deep-learning #gpu #cuda

NVIDIA/cuda-python

CUDA Python: A library that brings the power of CUDA to Python, enabling high-performance GPU acceleration.

3.2K
Active
Cython
ML Ops
CLI Tools
#gpu-acceleration #high-performance #python-bindings

LeelaChessZero/lc0

Open-source neural network chess engine with GPU acceleration and broad hardware support.

3.0K
Stable
C++
ML Ops
Computer Vision
C++
#chess #chess-ai #neural-networks

pytorch/TensorRT

PyTorch compiler for NVIDIA GPUs using TensorRT, enabling efficient deep learning inference on CUDA hardware.

3.0K
Active
Python
ML Ops
API Frameworks
PyTorch
#deep-learning #cuda #nvidia

NVIDIA/MinkowskiEngine

A high-performance, auto-diff neural network library for 3D and 4D sparse tensor computations.

2.9K
Archived
Python
Computer Vision
ML Ops
PyTorch
#3d-convolutional-network #4d-convolutional-neural-network #sparse-tensor-network

BBuf/how-to-optim-algorithm-in-cuda

This repository provides guidance on optimizing algorithms for CUDA, a framework for parallel computing on NVIDIA GPUs.

2.8K
Active
Cuda
LLM Frameworks
CLI Tools
CUDA
#cuda #optimization #parallel-computing

containers/ramalama

RamaLama simplifies local serving of AI models and enables their use for inference in production via containers.

2.6K
Active
Python
LLM Frameworks
Inference
Python
#ai #containers #inference-server

enpeizhao/CVprojects

A collection of computer vision and AI projects in Python, C++, and embedded systems for developers.

2.6K
Stable
Jupyter Notebook
Computer Vision
Backend Frameworks
Python
#computer-vision #deep-learning #embedded-systems
