Showing 101-120 of 321 projects
An open-source cloud-native AI platform for ML/DL workflows, model serving, and distributed training.
An easy-to-use PyTorch to TensorRT converter for optimizing AI model inference on NVIDIA Jetson devices.
Edward is a probabilistic programming language in TensorFlow for deep generative models and variational inference.
AITemplate is a Python framework for rendering neural networks into high-performance CUDA/HIP C++ code, optimized for GPU inference.
Infrastructure components for cost-efficient GenAI model inference and deployment.
Optimize AI inference performance on GPUs with this Python library for selecting and tuning inference engines.
A blazing-fast inference solution for text embedding models, built with Rust.
A toolkit for automated causal inference and econometric analysis, combining machine learning and econometrics.
Smart LLM router for AI inference cost optimization.
Coz is a causal profiler for C/C++ that helps developers optimize performance by identifying bottlenecks.
A fast inference library for running large language models (LLMs) locally on modern GPUs.
Efficient AI model backbones developed by Huawei's Noah's Ark Lab, including GhostNet, TNT, and MLP.
A collection of pre-trained deep learning models and demos optimized for high performance using the OpenVINO toolkit.
Fast C++ inference engine for Transformer models, supporting CUDA, MKL, and other optimizations.
Python inference and LoRA trainer package for the LTX-2 model.
A unified and scalable ML library for large-scale distributed training, model serving, and federated learning.
LightLLM is a high-performance, scalable Python-based framework for inference and serving of large language models.
A course for systems engineers on building a tiny vLLM-style inference server and serving Qwen models on Apple Silicon.
Collection of generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
A minimal Python framework for building custom AI inference servers with full control over logic, batching, and scaling.