An open-source library for quantizing diffusion models to 4-bit precision, absorbing outliers through low-rank components.
A high-performance Transformer library for accelerating AI models on NVIDIA GPUs, including low-precision support.
Implements the AWQ algorithm for 4-bit quantization, delivering a 2x speedup during inference.
A Python library providing state-of-the-art compression techniques and efficient LLM inference on Intel platforms, aimed at building chatbots quickly.
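The first project's approach, absorbing outliers into a low-rank component before 4-bit quantization, can be sketched in a few lines. This is a minimal illustration of the general technique (an SVD split plus uniform quantization of the residual), not that library's actual implementation; all function names here are hypothetical.

```python
import numpy as np

def quantize_4bit(w, n_bits=4):
    # Uniform symmetric quantization (a common baseline, not any
    # specific library's kernel). Returns the dequantized tensor so
    # the reconstruction error can be measured directly.
    qmax = 2 ** (n_bits - 1) - 1          # 7 for signed 4-bit
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

def lowrank_plus_quant(w, rank=8, n_bits=4):
    # Keep a low-rank component in full precision and quantize only
    # the residual. Large-magnitude directions (outliers) land in the
    # low-rank part, so they no longer inflate the quantization scale.
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    lowrank = (u[:, :rank] * s[:rank]) @ vt[:rank]
    residual = w - lowrank
    return lowrank + quantize_4bit(residual, n_bits)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
w[0, 0] = 25.0                            # inject a single outlier

err_plain = np.abs(w - quantize_4bit(w)).mean()
err_split = np.abs(w - lowrank_plus_quant(w)).mean()
print(err_plain, err_split)
```

With the outlier present, the plain quantizer's scale is dominated by the single large entry, while the split variant quantizes a residual with a much smaller dynamic range, so its mean reconstruction error is noticeably lower.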