Run a GPT model in the browser with WebGPU, using a lightweight JavaScript implementation.
A Chinese NLP solution with large models, data, training, and inference capabilities for developers.
A free, open-source Rust inference server compatible with the OpenAI API, suitable for vibe coders.
Multi-LoRA inference server that scales to thousands of fine-tuned LLMs.
High-performance inference and deployment toolkit for LLMs and VLMs.
A library for efficient weight quantization of large language models to accelerate inference on edge devices.
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
Gluon is a static, type-inferred and embeddable programming language written in Rust for building compilers and language tooling.
Provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio) and example notebooks.
A Rust framework for blazingly fast inference of ML models using zero-knowledge proofs.
A light-hearted yet rigorous approach to learning about impact estimation and causality in Python.
An index of algorithms for learning causality with data, useful for vibe coders working on AI-powered applications.
An open-source codebase for generating high-fidelity podcasts from text using AI models.
Quantized attention that achieves 2-5x speedup over FlashAttention for language, image, and video models.
A high-performance Transformer library for accelerating AI models on NVIDIA GPUs, including low-precision support.
Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion
Python library for causal AI and Bayesian networks.
A sparsity-aware deep learning inference runtime for CPUs, optimized for performance and efficiency.
Official implementation of a CVPR2020 paper for video-based 3D human pose and shape estimation.
A unified inference and post-training framework for accelerated video generation powered by AI.