DeepSeek-V3 is a large-scale MoE language model with 671B parameters, optimized for efficiency and performance.
DeepSeek-R1 is a series of open-source reasoning models, including distilled variants, for math, code, and general reasoning tasks.
Comprehensive LLM course with roadmaps, Colab notebooks, and tools for building and deploying LLM applications.
Fine-tuning framework for 100+ LLMs & VLMs
Deep learning paper implementations with side-by-side notes and explanations
Simplified training and fine-tuning for medium-sized GPT models
Fine-tuning & RL for LLMs with optimized performance and memory use
Train LLMs on a single GPU for minimal cost
Train a 26M-parameter GPT from scratch in 2 hours
BERT - Pre-trained NLP models and TensorFlow code for natural language understanding
Open platform for training, serving, and evaluating LLM chatbots with Vicuna and Chatbot Arena
Course on building a Storyteller AI LLM from scratch in Python, C, and CUDA
ControlNet adds conditional control to diffusion models for precise image generation.
Train and fine-tune Alpaca, a LLaMA model instruction-tuned on 52K instruction-following demonstrations.
Deep learning tuning playbook for maximizing model performance
Comprehensive guide for Chinese developers to deploy and fine-tune open-source LLMs on Linux
Open R1 is a fully open reproduction of DeepSeek-R1, enabling replication and extension of its reasoning capabilities.
MiniGPT-4 and MiniGPT-v2 for vision-language tasks
Pumpkin Book (南瓜书): detailed derivations of the formulas in the Machine Learning textbook (西瓜书, the "Watermelon Book")
Comprehensive LLM engineering and application resources with training, inference, compression, and deployment guides