Explore Projects

Discover 37 open source projects

Active filters (1):
Search: llm-inference

Showing 21-37 of 37 projects

predibase/lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

3.7K
Experimental
Python
LLM Frameworks
BaaS Platforms
PyTorch
#llm#fine-tuning#model-serving

neuralmagic/deepsparse

A sparsity-aware deep learning inference runtime for CPUs, optimized for performance and efficiency.

3.2K
Experimental
Python
Inference
API Frameworks
PyTorch
#computer-vision#nlp#object-detection

ruvnet/ruvector

High-performance vector graph neural network database in Rust for real-time AI inference and graph ML.

2.9K
Active
Rust
Inference
Vector Databases
Rust
#vector-database#gnn#graph-neural-networks

b4rtaz/distributed-llama

A C++ library for distributed large language model inference, allowing developers to build powerful AI applications with a cluster of home devices.

2.8K
Active
C++
LLM Frameworks
Containerization
#distributed-computing#llm-inference#llama2

spiceai/spiceai

A portable accelerated SQL query, search, and LLM-inference engine for data-grounded AI apps and agents.

2.8K
Active
Rust
LLM Frameworks
Databases
Rust
#artificial-intelligence#data-federation#full-text-search

FasterDecoding/Medusa

A simple framework for accelerating LLM generation with multiple decoding heads

2.7K
Archived
Jupyter Notebook
LLM Frameworks
Inference
Jupyter Notebook
#llm#generation#inference

SafeAILab/EAGLE

Official implementation of EAGLE, a speculative decoding framework for accelerating large language model inference.

2.2K
Active
Python
LLM Frameworks
AI Code Generation
Python
#large-language-models#llm-inference#speculative-decoding

intel/intel-extension-for-transformers

A Python library that provides SOTA compression techniques and efficient LLM inference on Intel platforms to build chatbots quickly.

2.2K
Archived
Python
LLM Frameworks
Inference
Python
#chatbot#llm-inference#compression

liltom-eth/llama2-webui

A tool to run Llama 2 language models locally with a Gradio UI, for building generative AI apps and agents.

1.9K
Archived
Jupyter Notebook
LLM Wrappers & SDKs
AI Coding Agents
Jupyter Notebook
#llama2#llm#generative-ai

neuron-core/neuron-ai

A flexible PHP framework for building production-ready AI-driven applications with modular components like LLMs and vector databases.

1.8K
Active
PHP
Agents & Orchestration
LLM Frameworks
#agent#agentic-ai#llm

beam-cloud/beta9

Ultrafast serverless GPU inference, sandboxes, and background jobs for AI-focused developers.

1.6K
Active
Go
LLM Frameworks
Serverless
Go
#serverless#gpu#inference

Nano-Collective/nanocoder

A community-driven, local-first coding agent that integrates with AI tools like OpenAI and Ollama.

1.4K
Active
TypeScript
AI Coding Agents
MCP Frameworks
React
#ai-coding#llm-integration#local-first

sauravpanda/BrowserAI

Run local LLMs like llama, deepseek-distill, kokoro and more inside your browser

1.4K
Active
TypeScript
LLM Frameworks
Frontend Frameworks
React
#llm#local-llm#webgpu

taielab/awesome-hacking-lists

A curated collection of top-tier penetration testing tools and productivity utilities for security researchers and bug bounty hunters.

1.3K
Stable
Penetration Testing
CLI Tools
#hacking#security-research#penetration-testing

lean-dojo/LeanCopilot

LeanCopilot is a C++ library that uses large language models (LLMs) as copilots for theorem proving in the Lean programming language.

1.2K
Active
C++
LLM Frameworks
API Frameworks
#formal-mathematics#lean#theorem-proving

felladrin/awesome-ai-web-search

A curated list of AI-powered web search tools and resources for developers and researchers.

1.2K
Active
HTML
LLM Frameworks
RAG & Vector
React
#ai#search-engine#information-retrieval

character-ai/prompt-poet

Streamlines prompt design for developers and non-technical users with a low-code approach.

1.1K
Stable
Python
LLM Frameworks
Prompt Engineering
Python
#llm#prompt-design#prompt-engineering