Showing 21-37 of 37 projects
A multi-LoRA inference server that scales to thousands of fine-tuned LLMs.
A sparsity-aware deep learning inference runtime for CPUs, optimized for performance and efficiency.
A high-performance vector and graph database in Rust for real-time AI inference and graph ML.
A C++ library for distributed large language model inference, allowing developers to build powerful AI applications with a cluster of home devices.
A portable accelerated SQL query, search, and LLM-inference engine for data-grounded AI apps and agents.
A simple framework for accelerating LLM generation with multiple decoding heads.
Official implementation of EAGLE, a speculative-decoding framework for lossless acceleration of LLM inference.
A Python library that provides SOTA compression techniques and efficient LLM inference on Intel platforms to build chatbots quickly.
A tool to run Llama 2 language models locally with a Gradio UI, for building generative AI apps and agents.
A flexible PHP framework for building production-ready AI-driven applications with modular components like LLMs and vector databases.
Ultrafast serverless GPU inference, sandboxes, and background jobs for AI-focused developers.
A community-driven, local-first coding agent that integrates with AI providers like OpenAI and Ollama.
Run local LLMs like Llama, DeepSeek-Distill, Kokoro, and more inside your browser.
A curated collection of top-tier penetration testing tools and productivity utilities for security researchers and bug bounty hunters.
LeanCopilot is a framework that uses large language models (LLMs) as copilots for theorem proving in the Lean proof assistant.
A curated list of AI-powered web search tools and resources for developers and researchers.
A low-code tool that streamlines prompt design for developers and non-technical users.