Explore Projects

Discover 5 open source projects

Active filters (1):
Search: tensorrt-llm

Showing 1-5 of 5 projects

NVIDIA/TensorRT-LLM

TensorRT LLM provides a Python API and optimizations to efficiently run large language models on NVIDIA GPUs.

13.0K
Active
Python
LLM Frameworks
PyTorch
#cuda #llm-serving #moe

xlite-dev/Awesome-LLM-Inference

A curated list of awesome papers and code for optimizing LLM/VLM inference performance.

5.0K
Active
Python
LLM Frameworks
LLM Wrappers & SDKs
#llm #inference #optimization

collabora/WhisperLive

A nearly-live implementation of OpenAI's Whisper, a powerful speech recognition and translation tool.

3.9K
Active
Python
LLM Frameworks
AI Voice & Speech
#whisper #speech-recognition #translation

NVIDIA/ChatRTX

A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM.

3.1K
Active
Python
LLM Frameworks
RAG & Vector
#chatbots #retrieval-augmented-generation #tensorrt

NVIDIA/Model-Optimizer

A Python library for optimizing deep learning models for faster inference on deployment platforms like TensorRT.

2.1K
Active
Python
Inference
CLI Tools
#deep-learning #model-optimization #quantization
