Explore Projects

Discover 5 open source projects

Active filters (1):

Search: tensorrt-llm

Showing 1-5 of 5 projects

NVIDIA/TensorRT-LLM

TensorRT LLM provides a Python API and optimizations to efficiently run large language models on NVIDIA GPUs.

13.0K

Active

Python

LLM Frameworks

PyTorch

#cuda #llm-serving #moe

xlite-dev/Awesome-LLM-Inference

A curated list of awesome papers and code for optimizing LLM/VLM inference performance

5.0K

Active

Python

LLM Frameworks

LLM Wrappers & SDKs

#llm #inference #optimization

collabora/WhisperLive

A nearly-live implementation of OpenAI's Whisper, a powerful speech recognition and translation tool.

3.9K

Active

Python

LLM Frameworks

AI Voice & Speech

#whisper #speech-recognition #translation

NVIDIA/ChatRTX

A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM.

3.1K

Active

Python

LLM Frameworks

RAG & Vector

#chatbots #retrieval-augmented-generation #tensorrt

NVIDIA/Model-Optimizer

A Python library for optimizing deep learning models for faster inference on deployment platforms like TensorRT.

2.1K

Active

Python

Inference

CLI Tools

#deep-learning #model-optimization #quantization
