Efficient computing methods developed by Huawei Noah's Ark Lab for model compression and optimization.
Neural Network Compression Framework for enhanced OpenVINO™ inference.
A tool for structurally pruning large language models like LLaMA, BLOOM, and Vicuna to reduce their size and inference time.
A library for accelerating deep neural networks through channel pruning, a model compression technique.
A compression toolkit for YOLOv5 with support for various backbones, modules, and deployment options.