Showing 141-160 of 321 projects
PyTorch compiler for NVIDIA GPUs using TensorRT, enabling efficient deep learning inference on CUDA hardware.
A distributed inference engine for large language models and Stable Diffusion on mobile, desktop, and server.
A Python framework for efficient model inference with omni-modality AI models.
A general plug-and-play inference library for Recursive Language Models (RLMs) supporting various sandboxes.
High-performance vector graph neural network database in Rust for real-time AI inference and graph ML.
A high-performance, auto-diff neural network library for 3D and 4D sparse tensor computations.
A C++ library for distributed large language model inference, allowing developers to build powerful AI applications with a cluster of home devices.
A portable accelerated SQL query, search, and LLM-inference engine for data-grounded AI apps and agents.
A lightweight, self-contained Rust library for running TensorFlow and ONNX models with no dependencies.
A C++ library for building local AI inference platforms with support for ONNX models.
An open-source C++ library for Bayesian inference and data analysis using Markov Chain Monte Carlo (MCMC) methods.
A simple framework for accelerating LLM generation with multiple decoding heads.
This repository provides an AI-powered toolkit for developing applications with the Rockchip RKNN inference engine.
An efficient multi-scale object detection training and inference algorithm for computer vision tasks.
This repository provides code and models for running inference with the SAM 3D Body Model, a tool for 3D body reconstruction.
An AI-powered tool for training supervised models without manual labeling, using foundation models and multimodal learning.
RamaLama simplifies local serving of AI models and enables their use for inference in production via containers.
A probabilistic programming library powered by NumPy and JAX for Bayesian inference and MCMC sampling.
Achieve state-of-the-art inference performance on modern accelerators with this Kubernetes-based solution.
A scalable inference engine for diffusion transformers with massive parallelism.