Run local LLMs on any device with GPT4All
Ray is a unified framework for scaling AI and Python applications with distributed computing and ML libraries.
Detect secrets in git repos and files
Comprehensive LLM engineering and application resources with training, inference, compression, and deployment guides
A collection of high-performance large language models (LLMs) with recipes to pretrain, finetune, and deploy at scale.
Deploy open-source LLMs as OpenAI-compatible API endpoints using BentoML's model serving framework.
Official inference library for running Mistral models in AI-powered applications.
OpenVINO is an open-source toolkit for optimizing and deploying AI inference on a variety of hardware.
High-performance C++ library for fast local deployment of large language models (LLMs) like LLaMA.
BentoML is an easy-to-use framework for building and deploying production-ready machine learning models as APIs.
LMDeploy is a toolkit for compressing, deploying, and serving large language models (LLMs).
Delivers infrastructure for agentic apps with AI-native proxy and data plane.
Open-source implementation of AlphaEvolve, a coding agent for iterative code optimization and discovery.
Superduper is an end-to-end framework for building custom AI applications and agents using Python, PyTorch, and Transformers.
A Python library for serving large language models (LLMs) with high performance, including GPU acceleration and distributed inference.
A curated list of awesome papers and code for optimizing LLM/VLM inference performance
Eko is an agentic framework that helps developers build production-ready AI-powered workflows with natural language interactions.
Optimize AI inference performance on GPUs with this Python library for selecting and tuning inference engines.
Collection of generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
A free, open-source inference server written in Rust and compatible with the OpenAI API, suitable for vibe coders