Fine-tuning framework for 100+ LLMs & VLMs.
Faster Whisper transcription with CTranslate2 for efficient speech-to-text.
Open-source Chinese LLaMA and Alpaca large language models for local CPU/GPU training and deployment.
An AI-powered quantitative investment research platform for algorithmic trading and backtesting.
Efficient fine-tuning of quantized LLMs for AI developers.
Accessible large language models via k-bit quantization for PyTorch developers working with AI tools.
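As a concept sketch of the absmax (symmetric) k-bit weight quantization such libraries build on, here is a minimal pure-Python example at 8 bits; the function names and sample values are illustrative, not any library's API:

```python
def quantize_int8(weights):
    """Absmax (symmetric) quantization: scale floats into [-127, 127] integers."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Reconstruct approximate floats from the integers and the shared scale."""
    return [v * scale for v in q]

w = [0.5, -1.2, 0.03, 2.4]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# Rounding error per weight is bounded by half the quantization step (s / 2).
```

Real implementations quantize per-tensor or per-block, store the integers packed, and keep only the scales in floating point.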
A high-performance PNG compressor library and CLI tool for reducing the file size of PNG images.
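Since PNG image data is stored DEFLATE-compressed, one lever any PNG compressor has is the zlib compression level; a small sketch with Python's standard `zlib` module (the byte string is placeholder data, not real scanlines):

```python
import zlib

# Placeholder for filtered PNG scanline bytes.
raw = b"example scanline data " * 500

fast = zlib.compress(raw, level=1)   # fast, usually larger output
best = zlib.compress(raw, level=9)   # slower, usually smaller output

# DEFLATE is lossless: both streams decompress to the original bytes.
assert zlib.decompress(best) == raw
```

Dedicated PNG compressors go further, e.g. trying different scanline filters and reduced palettes before the DEFLATE step.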
Fast C++ inference engine for Transformer models, supporting CUDA, MKL, and other optimizations.
An open-source library for quantizing diffusion models to 4-bit precision, absorbing outliers through low-rank components.
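A toy NumPy sketch of the idea behind absorbing outliers into a low-rank component: peel off a full-precision rank-1 term before quantizing, so the residual has a much smaller dynamic range (the matrices and 4-bit absmax scheme here are illustrative assumptions, not the library's actual algorithm):

```python
import numpy as np

np.random.seed(0)
# Small residual weights plus a strong rank-1 "outlier" component.
R = np.random.uniform(-0.1, 0.1, (8, 8))
u, v = np.random.randn(8, 1), np.random.randn(1, 8)
W = 5.0 * u @ v + R

def quantize_absmax(M, bits=4):
    """Quantize to signed `bits`-bit integers and return the dequantized matrix."""
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit signed
    scale = np.abs(M).max() / qmax
    return np.round(M / scale) * scale

# Direct 4-bit quantization: the rank-1 component stretches the range,
# so the quantization step is coarse.
err_direct = np.abs(W - quantize_absmax(W)).mean()

# Low-rank absorption: keep the best rank-1 approximation in full
# precision and quantize only the small residual.
U, S, Vt = np.linalg.svd(W)
L = S[0] * np.outer(U[:, 0], Vt[0])
err_lowrank = np.abs(W - (L + quantize_absmax(W - L))).mean()
```

With the outlier direction removed, the residual's absmax shrinks and the same 4-bit budget yields a much finer step size.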
EasyQuant is a Python-based quantitative trading framework for real-time market data and automated trading.
A library for efficient weight quantization of large language models to accelerate inference on edge devices.
Provides GGUF quantization support for native ComfyUI models.
Quantized attention that achieves 2-5x speedup over FlashAttention for language, image, and video models.
A sparsity-aware deep learning inference runtime for CPUs, optimized for performance and efficiency.
Pretrained language model and optimization techniques for large-scale distributed AI/ML development.
A model library for exploring state-of-the-art deep learning techniques that optimize NLP neural networks.
A memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
ComfyUI plugin for Nunchaku, a high-performance inference engine for 4-bit quantized diffusion models.
A PyTorch repository providing pre-trained models and datasets for common computer vision tasks.
Optimizes large language models for low-bit precision and sparsity, improving model compression techniques.
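Sparsity in this context usually starts from magnitude pruning: zeroing the smallest-magnitude weights. A minimal sketch under the assumption of distinct magnitudes (ties would prune extra weights); the function and values are illustrative:

```python
def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude `sparsity` fraction of weights."""
    k = int(len(weights) * sparsity)
    threshold = sorted(abs(w) for w in weights)[k - 1] if k else 0.0
    return [0.0 if abs(w) <= threshold else w for w in weights]

w = [0.1, -0.8, 0.05, 1.2, -0.3, 0.02]
pruned = magnitude_prune(w, sparsity=0.5)
# The three smallest-magnitude weights (0.1, 0.05, 0.02) are zeroed.
```

Practical schemes add structure, e.g. 2:4 semi-structured patterns, so that sparse kernels can actually skip the zeroed weights at inference time.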