Explore Projects

Discover 46 open-source projects

Active filters (1):
Search: quantization

Showing 21-40 of 46 projects

quic/aimet

AIMET is an open-source library for advanced quantization and compression techniques in trained neural network models.

2.6K
Active
Python
React
#quantization #compression #neural-networks

Efficient-ML/Awesome-Model-Quantization

A comprehensive collection of resources for model quantization research and optimization.

2.3K
Active
Model Optimization
Documentation
#model-quantization #deep-learning #efficient-deep-learning

dvmazur/mixtral-offloading

Run Mixtral-8x7B language models on Colab or consumer desktops with offloading capabilities.

2.3K
Archived
Python
LLM Frameworks
BaaS Platforms
PyTorch
#language-model #offloading #quantization

casper-hansen/AutoAWQ

Implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.

2.3K
Experimental
Python
Inference
AI Code Generation
#quantization #speedup #inference

adithya-s-k/AI-Engineering.academy

A collection of Jupyter Notebooks on various AI and machine learning concepts, including fine-tuning, inference, and LLMs.

2.1K
Stable
Jupyter Notebook
LLM Frameworks
Fine-tuning
Python
#fine-tuning #large-language-models #inference

NVIDIA/Model-Optimizer

A Python library for optimizing deep learning models for faster inference on deployment platforms like TensorRT.

2.1K
Active
Python
Inference
CLI Tools
#deep-learning #model-optimization #quantization

horseee/Awesome-Efficient-LLM

A curated list of efficient and compressed large language models for developers to explore.

2.0K
Experimental
Python
LLM Frameworks
LLM Compression
#compression #efficient-llm #knowledge-distillation

OpenPPL/ppq

PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.

1.8K
Archived
Python
Inference
API Frameworks
PyTorch
#neural-network #quantization #deep-learning

descriptinc/descript-audio-codec

State-of-the-art audio codec with a 90x compression factor.

1.7K
Active
Python
PyTorch
#audio-compression #deep-learning #gans

open-mmlab/mmrazor

An open-source toolbox and benchmark for model compression and acceleration in PyTorch.

1.7K
Archived
Python
ML Ops
API Frameworks
PyTorch
#model-compression #model-acceleration #benchmark

mit-han-lab/smoothquant

SmoothQuant is an efficient post-training quantization tool for large language models, enabling accurate and fast inference.

1.6K
Archived
Python
LLM Frameworks
Inference
#quantization #large-language-models #performance-optimization

PaddlePaddle/PaddleSlim

PaddleSlim is an open-source library for deep model compression and architecture search.

1.6K
Active
Python
Inference
ML Ops
PyTorch
#compression #architecture-search #model-optimization

JustGlowing/minisom

A minimalistic implementation of the Self Organizing Maps (SOM) algorithm for clustering and dimensionality reduction.

1.6K
Active
Python
Clustering
Dimensionality Reduction
#clustering #dimensionality-reduction #unsupervised-learning

tensorflow/model-optimization

A toolkit to optimize machine learning models for deployment, including quantization and pruning.

1.6K
Active
Python
ML Ops
API Frameworks
TensorFlow
#machine-learning #model-optimization #quantization

RWKV/rwkv.cpp

An efficient C++ implementation of the RWKV language model for fast CPU inference at several quantization bit widths.

1.6K
Experimental
C++
LLM Frameworks
Inference
#language-model #llm #quantization

Xilinx/brevitas

Brevitas is a PyTorch library for neural network quantization, enabling efficient hardware acceleration on FPGAs and other devices.

1.5K
Active
Python
ML Ops
FPGA
PyTorch
#deep-learning #neural-networks #quantization

RahulSChand/gpu_poor

A JavaScript tool for estimating tokens per second and GPU memory requirements for large language models such as LLaMA.

1.4K
Archived
JavaScript
LLM Frameworks
CLI Tools
#llm #gpu #quantization

Vahe1994/AQLM

Official PyTorch repository for extreme compression of large language models using additive quantization and PV-Tuning.

1.3K
Stable
Python
LLM Frameworks
LLM Wrappers & SDKs
PyTorch
#large-language-models #compression #quantization

huawei-noah/Efficient-Computing

Efficient computing methods developed by Huawei Noah's Ark Lab for model compression and optimization.

1.3K
Archived
Jupyter Notebook
Model Compression
Quantization
#model-compression #quantization #pruning

hengxuZ/binance-quantization

A Python cryptocurrency trading system for the Binance exchange that uses grid trading strategies.

1.3K
Archived
Python
API Frameworks
Backend Frameworks
#cryptocurrency #trading #binance
