Explore Projects

Discover 380 open source projects

Active filters (1):
Search: data-science

Showing 81-100 of 380 projects

pymupdf/PyMuPDF

A high-performance Python library for data extraction, analysis, conversion and manipulation of PDF and other documents.

9.2K
Active
Python
Document Processing
#pdf #data-extraction #text-processing

blue-yonder/tsfresh

Automatic feature extraction from time series data for data science and machine learning applications.

9.1K
Stable
Jupyter Notebook
Feature Extraction
ETL & Pipelines
Python
#time-series #feature-engineering #data-science

activeloopai/deeplake

Versatile database for AI, supporting storage, querying, versioning, and visualization of any AI data.

9.0K
Active
C++
LLM Frameworks
Vector Databases
PyTorch
#ai #data-storage #vector-database

lazyprogrammer/machine_learning_examples

A collection of machine learning examples and tutorials for data scientists and ML engineers.

8.8K
Stable
Python
Machine Learning
Data Science
Python
#data-science #machine-learning #deep-learning

catboost/catboost

A high-performance gradient boosting library for machine learning tasks on CPUs and GPUs.

8.8K
Active
C++
ML Ops
API Frameworks
Python
#machine-learning #gradient-boosting #decision-trees

mage-ai/mage-ai

A Python-based platform for building, running, and managing data pipelines that integrate and transform data.

8.7K
Active
Python
ETL & Pipelines
ML Ops
Python
#data-pipelines #data-transformation #data-integration

vaexio/vaex

A high-performance Python library for working with large tabular datasets, offering efficient data manipulation and visualization.

8.5K
Stable
Python
Databases
Caching
Python
#bigdata #data-science #dataframe

jackzhenguo/python-small-examples

A collection of Python code examples and tutorials for data science, machine learning, and web development.

8.1K
Archived
Python
Data Science
Backend Frameworks
Python
#data-science #machine-learning #web-development

py-why/dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions.

8.0K
Active
Python
Causal Inference
Databases
Python3
#causal-inference #causal-models #bayesian-networks

mrdbourke/machine-learning-roadmap

A comprehensive roadmap for learning essential machine learning concepts and tools.

7.8K
Archived
Machine Learning Ops
Tutorials & Courses
#machine-learning #data-science #deep-learning

alteryx/featuretools

An open-source Python library for automated feature engineering in machine learning.

7.6K
Active
Python
Automated Machine Learning
ETL & Pipelines
Python
#automated-feature-engineering #machine-learning #data-science

h2oai/h2o-3

An open-source, distributed machine learning platform with support for various algorithms and AutoML.

7.5K
Active
Jupyter Notebook
ML Ops
Databases
#machine-learning #automl #distributed

firmai/industry-machine-learning

A curated collection of practical machine learning and data science notebooks and libraries across different industries.

7.4K
Archived
Jupyter Notebook
Machine Learning Ops
Databases
Jupyter Notebook
#data-science #machine-learning #jupyter-notebook

growthbook/growthbook

Open-source feature flagging and A/B testing platform for experimentation, data analysis, and remote config.

7.4K
Active
TypeScript
Feature Flags
Analytics & Tracking
React
#ab-testing #feature-flags #data-analysis

Visualize-ML/Book3_Elements-of-Mathematics

A book on the fundamentals of mathematics, covering topics from basic arithmetic to machine learning, written for developers interested in AI tools and applications.

7.4K
Stable
Jupyter Notebook
LLM Frameworks
Books & Guides
Jupyter Notebook
#machine-learning #mathematics #data-science

python-visualization/folium

A Python library for creating interactive maps and visualizations using the Leaflet.js JavaScript library.

7.3K
Active
Python
Charts & Visualization
Data Visualization
Python
#data-visualization #interactive-maps #geographic-information-systems

evidentlyai/evidently

Evidently is an open-source ML and LLM observability framework to evaluate, test, and monitor AI-powered systems.

7.3K
Active
Jupyter Notebook
MLOps
Data Validation
Jupyter Notebook
#data-quality #data-validation #model-monitoring

rasbt/python-machine-learning-book-2nd-edition

A comprehensive code repository for the popular 'Python Machine Learning (2nd edition)' book, covering data science, machine learning, and deep learning topics.

7.2K
Archived
Jupyter Notebook
Machine Learning
Databases
Python
#machine-learning #data-science #deep-learning

scikit-learn-contrib/imbalanced-learn

A Python package to tackle the curse of imbalanced datasets in machine learning.

7.1K
Stable
Python
Python
#machine-learning #imbalanced-datasets #python-package

jwilber/roughViz

A reusable JavaScript library for creating sketchy, hand-drawn-style charts in the browser.

7.1K
Archived
JavaScript
Charts & Visualization
Data Visualization
D3.js
#data-visualization #charting #sketchy
