Explore Projects

Discover 46 open source projects

Search: quantization

Showing 21-40 of 46 projects

quic/aimet

AIMET is an open-source library for advanced quantization and compression techniques in trained neural network models.

2.6K

Active

Python

React

#quantization #compression #neural-networks

Efficient-ML/Awesome-Model-Quantization

A comprehensive collection of resources for model quantization research and optimization.

2.3K

Active

Model Optimization

Documentation

#model-quantization #deep-learning #efficient-deep-learning

dvmazur/mixtral-offloading

Run Mixtral-8x7B language models on Colab or consumer desktops with offloading capabilities.

2.3K

Archived

Python

LLM Frameworks

BaaS Platforms

PyTorch

#language-model #offloading #quantization

casper-hansen/AutoAWQ

Implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.

2.3K

Experimental

Python

Inference

AI Code Generation

Python

#quantization #speedup #inference

adithya-s-k/AI-Engineering.academy

A collection of Jupyter Notebooks on various AI and machine learning concepts, including fine-tuning, inference, and LLMs.

2.1K

Stable

Jupyter Notebook

LLM Frameworks

Fine-tuning

Python

#fine-tuning #large-language-models #inference

NVIDIA/Model-Optimizer

A Python library for optimizing deep learning models for faster inference on deployment platforms like TensorRT.

2.1K

Active

Python

Inference

CLI Tools

#deep-learning #model-optimization #quantization

horseee/Awesome-Efficient-LLM

A curated list of efficient and compressed large language models for developers to explore.

2.0K

Experimental

Python

LLM Frameworks

LLM Compression

#compression #efficient-llm #knowledge-distillation

OpenPPL/ppq

PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.

1.8K

Archived

Python

Inference

API Frameworks

PyTorch

#neural-network #quantization #deep-learning

descriptinc/descript-audio-codec

State-of-the-art audio codec with a 90x compression factor.

1.7K

Active

Python

PyTorch

#audio-compression #deep-learning #gans

open-mmlab/mmrazor

An open-source toolbox and benchmark for model compression and acceleration in PyTorch.

1.7K

Archived

Python

ML Ops

API Frameworks

PyTorch

#model-compression #model-acceleration #benchmark

mit-han-lab/smoothquant

SmoothQuant is an efficient post-training quantization tool for large language models, enabling accurate and fast inference.

1.6K

Archived

Python

LLM Frameworks

Inference

Python

#quantization #large-language-models #performance-optimization

PaddlePaddle/PaddleSlim

PaddleSlim is an open-source library for deep model compression and architecture search.

1.6K

Active

Python

Inference

ML Ops

PyTorch

#compression #architecture-search #model-optimization

JustGlowing/minisom

A minimalistic implementation of the Self Organizing Maps (SOM) algorithm for clustering and dimensionality reduction.

1.6K

Active

Python

Clustering

Dimensionality Reduction

Python

#clustering #dimensionality-reduction #unsupervised-learning

tensorflow/model-optimization

A toolkit to optimize machine learning models for deployment, including quantization and pruning.

1.6K

Active

Python

ML Ops

API Frameworks

TensorFlow

#machine-learning #model-optimization #quantization

RWKV/rwkv.cpp

An efficient C++ implementation of the RWKV language model for fast CPU inference at several quantization bit widths.

1.6K

Experimental

C++

LLM Frameworks

Inference

#language-model #llm #quantization

Xilinx/brevitas

Brevitas is a PyTorch library for neural network quantization, enabling efficient hardware acceleration on FPGAs and other devices.

1.5K

Active

Python

ML Ops

FPGA

PyTorch

#deep-learning #neural-networks #quantization

RahulSChand/gpu_poor

A JavaScript tool for estimating tokens per second and GPU memory requirements for large language models such as LLaMA.

1.4K

Archived

JavaScript

LLM Frameworks

CLI Tools

#llm #gpu #quantization

Vahe1994/AQLM

Official PyTorch repository for extreme compression of large language models using additive quantization and PV-Tuning.

1.3K

Stable

Python

LLM Frameworks

LLM Wrappers & SDKs

PyTorch

#large-language-models #compression #quantization

huawei-noah/Efficient-Computing

Efficient computing methods developed by Huawei Noah's Ark Lab for model compression and optimization.

1.3K

Archived

Jupyter Notebook

Model Compression

Quantization

#model-compression #quantization #pruning

hengxuZ/binance-quantization

This Python project provides a cryptocurrency trading system for the Binance exchange, using grid trading strategies.

1.3K

Archived

Python

API Frameworks

Backend Frameworks

Python

#cryptocurrency #trading #binance
