Showing 21-37 of 37 projects
A multi-LoRA inference server that scales to thousands of fine-tuned LLMs.
A sparsity-aware deep learning inference runtime for CPUs, optimized for performance and efficiency.
A high-performance vector and graph database in Rust for real-time AI inference and graph ML.
A C++ library for distributed large language model inference, allowing developers to build powerful AI applications with a cluster of home devices.
A portable accelerated SQL query, search, and LLM-inference engine for data-grounded AI apps and agents.
A simple framework for accelerating LLM generation with multiple decoding heads.
Official implementation of EAGLE, a speculative-decoding framework for lossless acceleration of LLM inference.
A Python library that provides SOTA compression techniques and efficient LLM inference on Intel platforms to build chatbots quickly.
A tool to run Llama 2 language models locally with a Gradio UI, for building generative AI apps and agents.
A flexible PHP framework for building production-ready AI-driven applications with modular components like LLMs and vector databases.
Ultrafast serverless GPU inference, sandboxes, and background jobs for AI-focused developers.
A community-driven, local-first coding agent that integrates with AI providers like OpenAI and Ollama.
Run local LLMs like Llama, DeepSeek-Distill, Kokoro, and more inside your browser.
A curated collection of top-tier penetration testing tools and productivity utilities for security researchers and bug bounty hunters.
LeanCopilot is a framework that uses large language models (LLMs) as copilots for theorem proving in the Lean proof assistant.
A curated list of AI-powered web search tools and resources for developers and researchers.
A low-code tool that streamlines prompt design for developers and non-technical users.