Explore Projects

Discover 15 open source projects

Active filters (1):
Search: distributed-training


GokuMohandas/Made-With-ML

Learn to build production-grade ML applications with code and best practices

46.6K
Archived
Jupyter Notebook
ML Ops
Tutorials & Courses
Jupyter Notebook
#machine-learning #mlops #data-science

huggingface/pytorch-image-models

A collection of PyTorch image encoders/backbones with training, evaluation, and inference scripts.

36.4K
Active
Python
LLM Frameworks
Full-Stack Frameworks
Next.js
#PyTorch #Image Models #Deep Learning

PaddlePaddle/Paddle

PaddlePaddle is a deep learning framework for industrial-scale ML, offering distributed training, model deployment, and Python/C++ support.

23.7K
Active
C++
ML Ops
PaddlePaddle
#deep-learning #distributed-training #machine-learning

PaddlePaddle/PaddleNLP

Easy-to-use and powerful LLM and SLM library with awesome model zoo for natural language processing.

12.9K
Stable
Python
LLM Frameworks
React
#llm #nlp #transformers

Netflix/metaflow

Build, manage and deploy AI/ML systems with Metaflow

9.9K
Active
Python
Next.js
#metaflow #ai #ml

skypilot-org/skypilot

Easily run, manage, and scale AI workloads on any infrastructure using a unified platform.

9.5K
Active
Python
ML Ops
Python
#cloud-computing #cloud-management #cost-optimization

FedML-AI/FedML

A unified and scalable ML library for large-scale distributed training, model serving, and federated learning.

4.0K
Stable
Python
ML Ops
Inference
React
#ai #machine-learning #federated-learning

bytedance/byteps

A high-performance and generic framework for distributed deep neural network training

3.7K
Archived
Python
ML Ops
API Frameworks
Keras
#distributed-training #deep-learning #machine-learning

tensorflow/adanet

A flexible AutoML framework with learning guarantees for building high-performance AI models.

3.5K
Archived
Jupyter Notebook
ML Ops
API Frameworks
TensorFlow
#automl #deep-learning #machine-learning

determined-ai/determined

Determined is an open-source machine learning platform for distributed training, hyperparameter tuning, and resource management.

3.2K
Experimental
Go
PyTorch
#machine-learning #open-source #distributed-training

alpa-projects/alpa

Alpa is a distributed training and serving framework for large-scale neural networks with auto-parallelization.

3.2K
Archived
Python
LLM Frameworks
API Frameworks
JAX
#distributed-computing #high-performance-computing #auto-parallelization

intelligent-machine-learning/dlrover

DLRover is an automatic distributed deep learning system for training large-scale AI models on Kubernetes.

1.6K
Active
Python
ML Ops
Containerization
Python
#distributed-training #kubernetes #large-scale-ai

pytorch/gloo

A collective communications library with various primitives for multi-machine training.

1.4K
Active
C++
ML Ops
API Frameworks
#collectives #distributed-training #pytorch

tensorlayer/HyperPose

Fast and flexible human pose estimation library for computer vision applications

1.3K
Archived
Python
TensorFlow
#pose estimation #computer vision #machine learning

DeepRec-AI/DeepRec

DeepRec is a high-performance deep learning recommendation framework based on TensorFlow, hosted in the LF AI & Data Foundation.

1.2K
Archived
C++
Recommendation Engine
API Frameworks
TensorFlow
#deep-learning #recommendation-engine #scalability
