DeepSpeed optimizes deep learning training and inference with distributed computing techniques.
A mixture-of-experts vision-language model for multimodal understanding.
DeepSeek-V2 is a strong, efficient, and economical mixture-of-experts language model.
Run Mixtral-8x7B language models on Colab or consumer desktops with offloading capabilities.
DeepSeekMoE is a Mixture-of-Experts language model framework aimed at fine-grained expert specialization.
A curated collection of resources on mixture-of-experts models.
PyTorch implementation of the Sparsely-Gated Mixture-of-Experts layer.
Kimi-VL is a multimodal AI model for advanced vision-language understanding and reasoning.
Aria is an open multimodal-native mixture-of-experts model for vision and language tasks.
A library for building Mixture-of-Experts (MoE) models from the LLaMA language model with continual pre-training.
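Most of the projects above build on the same core idea: a sparsely-gated Mixture-of-Experts layer routes each input to a small top-k subset of expert networks and combines their outputs, weighted by a learned gate. The toy sketch below illustrates that routing in plain Python; the expert functions, gate weights, and `moe_forward` helper are hypothetical names for illustration, not the API of any listed project, and real implementations operate on batched tensors with a jointly trained gate.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, gate_weights, k=2):
    """Route input x to the top-k experts chosen by a linear gate.

    experts: list of callables, each mapping a float vector to a float vector.
    gate_weights: one gate weight vector per expert (toy, hand-set here;
    in a real MoE layer these are trained alongside the experts).
    """
    # Gate logits: dot product of the input with each expert's gate vector.
    logits = [sum(wi * xi for wi, xi in zip(w, x)) for w in gate_weights]
    # Sparse routing: keep only the k largest logits; all others get zero weight,
    # so only k experts are ever evaluated.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top])
    # Output is the probability-weighted sum of the selected experts' outputs.
    out = [0.0] * len(x)
    for p, i in zip(probs, top):
        y = experts[i](x)
        out = [o + p * yi for o, yi in zip(out, y)]
    return out

# Toy usage: four "experts" that each scale the input by a different factor.
experts = [lambda v, s=s: [s * vi for vi in v] for s in (0.5, 1.0, 2.0, 3.0)]
gate_weights = [[0.1, 0.0], [0.0, 0.2], [0.3, 0.3], [0.0, 0.0]]
print(moe_forward([1.0, 1.0], experts, gate_weights, k=2))
```

With k=2, only the two highest-scoring experts run per input; this is the sparsity that lets MoE models like Mixtral-8x7B and DeepSeek-V2 grow total parameter count without a proportional increase in per-token compute.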