Explore Projects

Discover 125 open-source projects

Active filters (1):
Search: data-analysis

Showing 21-40 of 125 projects

tangyudi/Ai-Learn

Comprehensive learning roadmap for AI & machine learning, with 200+ practical cases and projects for beginners to experts.

12.7K
Archived
Learning & Education
#machine-learning #deep-learning #computer-vision

OpenRefine/OpenRefine

OpenRefine is a powerful data cleaning and transformation tool that helps developers work with messy data.

11.8K
Active
Java
Data Cleaning & Transformation
Java
#data-analysis #data-wrangling #data-cleaning

statsmodels/statsmodels

Statsmodels is a Python library for statistical modeling and econometrics, providing tools for data analysis and prediction.

11.3K
Active
Python
Data Science
Python
#data-analysis #statistics #econometrics

Yorko/mlcourse.ai

An open-source machine learning course focused on practical algorithms and data analysis in Python.

10.5K
Active
Python
Algorithms
Python
#machine-learning #data-science #algorithms

yzhao062/pyod

A Python library for outlier and anomaly detection, integrating classical and deep learning techniques.

9.7K
Active
Python
Anomaly & Outlier Detection
#anomaly-detection #outlier-detection #unsupervised-learning

rapidsai/cudf

A high-performance GPU DataFrame library for data analysis and machine learning workloads.

9.5K
Active
C++
Databases
Python
#data-analysis #data-science #gpu

gonum/gonum

Gonum is a set of numeric libraries for the Go programming language, providing tools for data analysis, scientific computing, and more.

8.3K
Active
Go
API Frameworks
Databases
Go
#data-analysis #matrix #graph

jeecgboot/jimureport

Open-source BI and reporting tool with powerful AI-driven features for creating data visualizations and dashboards.

7.9K
Active
Java
Data Visualization
Charts & Visualization
#bi #data-analysis #data-visualization

growthbook/growthbook

Open-source feature flagging and A/B testing platform for experimentation, data analysis, and remote config.

7.4K
Active
TypeScript
Feature Flags
Analytics & Tracking
React
#ab-testing #feature-flags #data-analysis

Alluxio/alluxio

Alluxio is an open-source data orchestration platform for analytics and machine learning workloads in the cloud.

7.2K
Experimental
Java
Data Orchestration
ML Ops
Spark
#data-analysis #data-orchestration #memory-speed

scikit-learn-contrib/imbalanced-learn

A Python package to tackle the curse of imbalanced datasets in machine learning.

7.1K
Stable
Python
Python
#machine-learning #imbalanced-datasets #python-package

flyteorg/flyte

A flexible workflow orchestration platform that seamlessly integrates data, ML, and analytics stacks.

6.8K
Active
Go
ML Ops
API Frameworks
Go
#workflow-orchestration #data-integration #machine-learning

rhiever/Data-Analysis-and-Machine-Learning-Projects

A collection of data analysis and machine learning projects and resources for developers.

6.6K
Archived
Jupyter Notebook
Data Science
Learning & Education
Jupyter Notebook
#data-analysis #machine-learning #jupyter-notebook

qinwf/awesome-R

A curated list of awesome R packages, frameworks and software for data analysis and data science.

6.4K
Stable
R
Databases
ORMs & Query Builders
#r #rstats #data-analysis

cloudquery/cloudquery

Data pipelines for cloud config and security data, enabling CSPM, FinOps, and vulnerability management solutions.

6.3K
Active
Go
API Frameworks
ETL & Pipelines
Go
#cloud #security #data-engineering

pachyderm/pachyderm

Pachyderm is a data-centric pipeline and data versioning platform for building and scaling data-intensive applications.

6.3K
Experimental
Go
ETL & Pipelines
Containerization
Go
#data-pipelines #data-versioning #distributed-systems

lance-format/lance

An open-source data format for building high-performance multimodal AI applications with fast random access, vector indexing, and data versioning.

6.1K
Active
Rust
LLM Frameworks
Databases
Rust
#data-format #data-versioning #vector-index

datajuicer/data-juicer

A Python library for processing and analyzing data with foundation models and large language models.

6.0K
Active
Python
LLM Frameworks
ETL & Pipelines
Python
#data-processing #data-analysis #foundation-models

airbnb/knowledge-repo

A next-generation curated knowledge sharing platform for data scientists and other technical professionals.

5.5K
Archived
Python
Data Analysis
Data Science
Python
#data-analysis #data-science #knowledge-sharing

WillKoehrsen/Data-Analysis

A collection of data analysis projects built with Python and Jupyter Notebook.

5.5K
Archived
Jupyter Notebook
React
#data-analysis #python #jupyter-notebook
