Showing 121-133 of 133 projects
OpenCL integration for Python, with advanced features for parallel computing and GPU acceleration.
A powerful and extensible machine learning framework for hackers and AI enthusiasts, built in Rust.
Samples for Intel's oneAPI Toolkits, supporting a wide range of AI, HPC, and parallel computing use cases.
Fast Clojure matrix library for high-performance linear algebra and numerical computing on CPU and GPU.
A minimal implementation of the Flash Attention algorithm in CUDA for efficient AI model inference.
A Python library for profiling and inspecting memory usage in PyTorch applications.
A library for accessing CUDA, the parallel computing platform and API, for vibe coders.
A fast CUDA matrix multiplication library from scratch for high-performance computing and AI workloads.
Programmable CUDA/C++ GPU Graph Analytics library for high-performance parallel graph processing.
Safe Rust wrapper around the CUDA toolkit for GPU acceleration in AI/ML applications.
Robotics platform with GPU-accelerated collision detection, point cloud processing, and more.
An open-source PyTorch implementation of Video Frame Interpolation via Adaptive Separable Convolution.
A high-performance, GPU-accelerated multimedia/video processing framework for transcoding, AI inference, and more.
Get weekly updates on trending AI coding tools and projects.