Explore Projects

Discover 24 open source projects

Search: llamacpp

Showing 1-20 of 24 projects

janhq/jan

Open-source ChatGPT replacement with local AI model support

40.9K

Active

TypeScript

Desktop Model Runners

LLM Wrappers & SDKs

Tauri

#chatgpt#llm#localai

khoj-ai/khoj

Self-hostable AI second brain that answers questions from the web or your own documents

33.1K

Stable

Python

LLM Frameworks

Full-Stack Frameworks

Next.js

#khoj#ai#python

llmware-ai/llmware

Unified framework for building enterprise RAG pipelines with small, specialized models

14.9K

Active

Python

Next.js

#LLM Frameworks#RAG Pipelines#Small Specialized Models

getumbrel/llama-gpt

A self-hosted, offline, ChatGPT-like chatbot powered by Llama 2 with no data leaving your device.

11.0K

Archived

TypeScript

LLM Frameworks

TypeScript

#chatgpt#llama#llm

RunanywhereAI/runanywhere-sdks

Production ready AI toolkit for local AI inference

10.2K

Active

Kotlin

AI Coding Tools

#agent-framework#android#apple-intelligence

xorbitsai/inference

Unified, production-ready inference API to run open-source LLMs, speech models, and multimodal models in the cloud, on-prem, or on your laptop.

9.1K

Active

Python

LLM Frameworks

Inference

PyTorch

#artificial-intelligence#llm#inference

reorproject/reor

A private, local AI personal knowledge management app for high-entropy people.

8.5K

Experimental

JavaScript

LLM Frameworks

Vector Databases

React

#ai#local-first#note-taking

serge-chat/serge

A web interface for chatting with Alpaca through llama.cpp, with a fully dockerized setup and easy-to-use API.

5.7K

Stable

Svelte

LLM Frameworks

API Frameworks

Svelte

#alpaca#llama#docker

Michael-A-Kuykendall/shimmy

A free, open-source Rust inference server compatible with the OpenAI API

3.7K

Active

Rust

React

#authentication#inference-server#open-source

twinnydotdev/twinny

A free, open-source AI code completion plugin for Visual Studio Code that rivals GitHub Copilot.

3.6K

Stable

TypeScript

AI Code Editors

LLM Frameworks

TypeScript

#artificial-intelligence#code-completion#code-generation

Josh-XT/AGiXT

AGiXT is a comprehensive AI agent automation platform that streamlines AI integration and task orchestration.

3.2K

Active

Python

Agents & Orchestration

LLM Frameworks

Python

#ai-automation#llm-integration#task-orchestration

SilasMarvin/lsp-ai

An open-source language server that gives software engineers AI-powered functionality, assisting rather than replacing them.

3.1K

Archived

Rust

LLM Frameworks

IDE Extensions

#ai#lsp#llm

janhq/cortex.cpp

A C++ library for building local AI inference platforms with support for ONNX models.

2.8K

Experimental

C++

LLM Frameworks

API Clients & Testing

#onnx#onnxruntime#llm

containers/ramalama

RamaLama simplifies local serving of AI models and enables their use for inference in production via containers.

2.6K

Active

Python

LLM Frameworks

Inference

Python

#ai#containers#inference-server

mostlygeek/llama-swap

Reliable model swapping for local LLM servers - seamlessly switch between llama.cpp, vLLM, and compatible backends

2.6K

Active

Local Inference Engines

LLM Wrappers & SDKs

llama.cpp

#local-llm#model-swapping#llama-cpp

intel/intel-extension-for-transformers

A Python library that provides SOTA compression techniques and efficient LLM inference on Intel platforms to build chatbots quickly.

2.2K

Archived

Python

LLM Frameworks

Inference

Python

#chatbot#llm-inference#compression

alexpinel/Dot

A developer-focused platform for text-to-speech, RAG, and LLMs, with local-first architecture.

1.9K

Archived

JavaScript

LLM Frameworks

RAG & Vector

React

#text-to-speech#rag#llm

alexrozanski/LlamaChat

A native macOS app that allows you to chat with your favorite LLaMA language models.

1.5K

Archived

Swift

LLM Frameworks

Desktop

Swift

#llama#macos#swift

intentee/paddler

Open-source LLM load balancer and serving platform for self-hosting LLMs at scale.

1.5K

Active

Rust

LLM Frameworks

API Frameworks

Rust

#llm#load-balancer#container

RahulSChand/gpu_poor

A JavaScript tool for calculating tokens/s and GPU memory requirements for large language models like LLaMA.

1.4K

Archived

JavaScript

LLM Frameworks

CLI Tools

#llm#gpu#quantization
