LLM-based operators and pipelines for data preparation.
RamaLama simplifies local serving of AI models and enables their use for inference in production via containers.
Reliable model swapping for local LLM servers - seamlessly switch between llama.cpp, vLLM, and compatible backends
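The core idea behind such a swapper is that only one backend holds the GPU at a time, so switching means unloading the current one before loading the next. A toy sketch of that pattern (the class and method names here are hypothetical illustrations, not the project's actual API):

```python
class BackendRegistry:
    """Toy model-swapping registry: one backend is active at a time."""

    def __init__(self):
        self._active = None  # name of the currently loaded backend, if any

    def swap_to(self, name: str) -> str:
        """Activate `name`, unloading the previous backend first."""
        if self._active == name:
            return f"{name} already loaded"
        previous, self._active = self._active, name
        if previous:
            return f"unloaded {previous}, loaded {name}"
        return f"loaded {name}"


registry = BackendRegistry()
print(registry.swap_to("llama.cpp"))  # loaded llama.cpp
print(registry.swap_to("vllm"))       # unloaded llama.cpp, loaded vllm
```

A real implementation would additionally wait for in-flight requests to drain and verify the new backend is healthy before routing traffic to it.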
A Python library for optimizing deep learning models for faster inference on runtimes such as TensorRT.
A LangChain-based framework for end-to-end natural language to data insight conversion, with MCP Skills multi-agent architecture.
A community-maintained hardware plugin for running large language models (LLMs) on Ascend accelerators.
Enterprise-grade API gateway for monitoring and managing costs/rates across LLMs like OpenAI, Anthropic, and Azure OpenAI.
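Cost tracking in such a gateway boils down to pricing each request from its token counts. A minimal sketch of that calculation (the per-1K-token prices below are hypothetical placeholders for illustration; a real gateway loads current pricing from each provider):

```python
# Hypothetical per-1K-token prices, for illustration only.
PRICES_PER_1K = {
    "gpt-4o": {"input": 0.005, "output": 0.015},
    "claude-3-5-sonnet": {"input": 0.003, "output": 0.015},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one request from its token usage."""
    p = PRICES_PER_1K[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]


print(request_cost("gpt-4o", 1000, 500))  # 0.0125 under the placeholder prices
```

Rate limiting then works the same way, with a per-key running total compared against a budget instead of a price sheet.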
An AI inference operator for Kubernetes that makes it easy to serve ML models in production.
Lumina-mGPT 2.0 is a stand-alone autoregressive image modeling tool implemented in Python.
Adds vLLM backend support to IndexTTS, enabling faster AI-powered text-to-speech inference.
A Python library to evaluate the responses of large language models such as GPT-4 using Prometheus metrics.
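Prometheus scrapes metrics in a simple text exposition format, which is what an evaluation library like this ultimately emits. A small hypothetical helper showing that format (the metric name and labels are made up for illustration; they are not the project's actual metrics):

```python
def to_prometheus(name: str, help_text: str, samples) -> str:
    """Render counter samples in the Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines)


# Hypothetical metric: count of evaluated LLM responses, labeled by model.
print(to_prometheus(
    "llm_eval_total",
    "Evaluated LLM responses",
    [({"model": "gpt-4"}, 3)],
))
```

In practice one would use the official `prometheus_client` package rather than formatting the text by hand; the sketch above only shows what the scraped output looks like.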