Explore Projects

Discover 380 open source projects

Search: data-science

Showing 41-60 of 380 projects

microsoft/nni

An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, and hyperparameter tuning.

14.3K
Archived
Python
ML Ops
PyTorch
#automated-machine-learning #automl #hyperparameter-tuning

virgili0/Virgilio

A comprehensive learning resource for data science and machine learning, covering a wide range of topics and tools.

14.3K
Stable
Jupyter Notebook
Tutorials & Courses
#data-science #machine-learning #ai

oxnr/awesome-bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

14.3K
Stable
Databases
#big-data #data-analytics #data-science

mwaskom/seaborn

Statistical data visualization in Python.

13.8K
Active
Python
React
#data-visualization #statistical-analysis #python-library

visenger/awesome-mlops

A curated list of references for MLOps, the practice of managing and automating machine learning workflows.

13.8K
Archived
ML Ops
#machine-learning #devops #data-science

Data-Centric-AI-Community/ydata-profiling

A Python library for fast, customizable, and interactive data profiling and exploratory data analysis.

13.4K
Active
Python
Data Profiling
Python
#data-profiling #exploratory-data-analysis #data-quality

jpmorganchase/python-training

This repository provides Python training materials for business analysts and traders in the finance industry.

12.8K
Archived
Jupyter Notebook
API Frameworks
#banking #finance #data-science

tangyudi/Ai-Learn

Comprehensive learning roadmap for AI & machine learning, with 200+ practical cases and projects for beginners to experts.

12.7K
Archived
Learning & Education
#machine-learning #deep-learning #computer-vision

trinodb/trino

Trino is a distributed SQL query engine for big data, allowing fast, scalable, and cost-effective analytics.

12.6K
Active
Java
Databases
#big-data #analytics #data-science

rasbt/python-machine-learning-book

A comprehensive code repository and resource for learning machine learning with Python.

12.6K
Archived
Jupyter Notebook
Machine Learning Algorithms
Python
#machine-learning #data-mining #data-science

dair-ai/ML-Papers-of-the-Week

A weekly curated list of top machine learning research papers, useful for vibe coders.

12.2K
Experimental
LLM Frameworks
#machine-learning #deep-learning #research-papers

allenai/allennlp

An open-source NLP research library built on PyTorch for advanced natural language processing tasks.

11.9K
Archived
Python
LLM Frameworks
PyTorch
#natural-language-processing #deep-learning #research

OpenRefine/OpenRefine

OpenRefine is a powerful data cleaning and transformation tool that helps developers work with messy data.

11.8K
Active
Java
Data Cleaning & Transformation
Java
#data-analysis #data-wrangling #data-cleaning

ludwig-ai/ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models.

11.7K
Active
Python
LLM Frameworks
PyTorch
#llm #neural-network #machine-learning

microsoft/RD-Agent

An R&D agent that automates high-value generic R&D processes to let AI drive data-driven AI.

11.5K
Active
Python
Agents & Orchestration
Python
#ai #automation #data-mining

cleanlab/cleanlab

An open-source library for data-centric AI with tools for data quality and machine learning on messy, real-world data.

11.4K
Active
Python
Data Quality
Python
#data-centric-ai #data-quality #data-cleaning

statsmodels/statsmodels

Statsmodels is a Python library for statistical modeling and econometrics, providing tools for data analysis and prediction.

11.3K
Active
Python
Data Science
Python
#data-analysis #statistics #econometrics

great-expectations/great_expectations

A Python library that helps ensure data quality and reliability through data profiling and testing.

11.2K
Active
Python
ETL & Pipelines
#data-quality #data-testing #data-profiling

aws/amazon-sagemaker-examples

A collection of Jupyter notebooks showcasing how to build and deploy machine learning models with Amazon SageMaker.

10.9K
Active
Jupyter Notebook
ML Ops
Jupyter Notebook
#machine-learning #deep-learning #data-science

wandb/wandb

The AI developer platform to train and fine-tune models, and manage models from experimentation to production.

10.9K
Active
Python
ML Ops
PyTorch
#ai #machine-learning #model-versioning