Explore Projects

Discover 4 open source projects

Active filter: search "4-bits"


nunchaku-ai/nunchaku

An open-source library that quantizes diffusion models to 4-bit precision, absorbing outliers into low-rank components.

3.7K
Active
Python
Diffusion Models
Quantization
PyTorch
#diffusion-models #quantization #mlops
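The core idea behind nunchaku's scheme can be sketched in a few lines: pull a full-precision low-rank branch out of a weight matrix so it soaks up outlier energy, then quantize only the residual to 4 bits. This is a minimal NumPy illustration of that split, not nunchaku's API; the function names, the rank, and the symmetric int4 scheme are all assumptions made for the example.

```python
import numpy as np

def quantize_4bit(w):
    """Symmetric round-to-nearest 4-bit quantization (illustrative)."""
    step = np.abs(w).max() / 7          # int4 symmetric range [-7, 7]
    return np.clip(np.round(w / step), -8, 7) * step

def lowrank_plus_4bit(w, rank=4):
    """Split W into a full-precision low-rank branch that absorbs
    outlier energy, plus a 4-bit quantized residual. Hypothetical
    names and rank; not the library's actual interface."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    lowrank = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return lowrank + quantize_4bit(w - lowrank)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
w[0, 0] = 25.0                           # inject a single weight outlier
err_plain = np.abs(w - quantize_4bit(w)).mean()
err_split = np.abs(w - lowrank_plus_4bit(w)).mean()
# With the outlier absorbed by the low-rank branch, the residual's
# quantization step is much smaller, so err_split falls below err_plain.
```

The design point: a single outlier inflates the quantization step for the whole tensor, so removing it into a cheap low-rank side branch lets the remaining 4-bit grid spend its 16 levels on the well-behaved bulk of the weights.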

NVIDIA/TransformerEngine

A high-performance Transformer library for accelerating AI models on NVIDIA GPUs, including low-precision support.

3.2K
Active
Python
LLM Frameworks
Inference
PyTorch
#deep-learning #gpu #cuda

casper-hansen/AutoAWQ

Implements the AWQ algorithm for 4-bit quantization, delivering a roughly 2x speedup during inference.

2.3K
Experimental
Python
Inference
AI Code Generation
Python
#quantization #speedup #inference
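The "activation-aware" idea behind AWQ can be sketched roughly as follows: channels whose activations are large matter most to the layer output, so scale their weights up before quantization (and fold the inverse scale back) to protect them from rounding error. This NumPy sketch uses per-tensor int4 and a brute-force search over a scaling exponent; AutoAWQ's actual kernels, group-wise scales, and search are more sophisticated, and every name here is illustrative.

```python
import numpy as np

def quantize_4bit(w):
    """Symmetric round-to-nearest 4-bit quantization (illustrative)."""
    step = np.abs(w).max() / 7
    return np.clip(np.round(w / step), -8, 7) * step

def awq_style_search(x, w, grid=np.linspace(0, 1, 11)):
    """Grid-search a per-input-channel scaling exponent alpha:
    s_i = mean|x_i|^alpha. Weights are scaled up by s before
    quantization and 1/s is folded back, keeping the layer output
    close to full precision. Hypothetical helper, not AutoAWQ's API."""
    act_mag = np.abs(x).mean(axis=0) + 1e-8
    ref = x @ w
    best_err, best_wq = np.inf, None
    for alpha in grid:
        s = act_mag ** alpha
        wq = quantize_4bit(w * s[:, None]) / s[:, None]
        err = np.abs(ref - x @ wq).mean()
        if err < best_err:
            best_err, best_wq = err, wq
    return best_wq, best_err

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 64))
x[:, 0] *= 50                        # one salient input channel
w = rng.normal(size=(64, 64))
_, err_awq = awq_style_search(x, w)
err_plain = np.abs(x @ w - x @ quantize_4bit(w)).mean()
# alpha = 0 gives s = 1 and recovers plain quantization, so the
# searched error can never exceed err_plain.
```

Because the search grid includes alpha = 0 (no scaling), the activation-aware result is never worse than plain round-to-nearest; with a salient channel present it is usually strictly better.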

intel/intel-extension-for-transformers

A Python library providing state-of-the-art compression techniques and efficient LLM inference on Intel platforms for quickly building chatbots.

2.2K
Archived
Python
LLM Frameworks
Inference
Python
#chatbot #llm-inference #compression
