Explore Projects

Discover 380 open source projects

Active filters (1):
Search: data-science

Showing 221-240 of 380 projects

unslothai/hyperlearn

High-performance machine learning algorithms that run 2-2000x faster and use 50% less memory on all hardware.

2.4K
Archived
Jupyter Notebook
ML Ops
Machine Learning
Jupyter Notebook
#machine-learning #performance #optimization

scverse/scanpy

Single-cell analysis in Python, scaling to >100M cells.

2.4K
Active
Python
React
#single-cell analysis #bioinformatics #data-science

claimed-framework/claimed

CLAIMED aims to enable low-code/no-code, rapid-prototyping-style programming that moves seamlessly into production through CI/CD.

2.3K
Active
Jupyter Notebook
AI App Builders
BaaS Platforms
React
#low-code #no-code #rapid-prototyping

feature-engine/feature_engine

Open-source Python library for feature engineering and selection, compatible with scikit-learn.

2.2K
Active
Python
Feature Engineering
ORMs & Query Builders
scikit-learn
#feature-engineering #feature-extraction #feature-selection

man-group/ArcticDB

ArcticDB is a high-performance, serverless DataFrame database for the Python data science ecosystem.

2.2K
Active
C++
Databases
Caching
Python
#data-analysis #data-science #dataframe

pretzelai/pretzelai

A modern, open-source notebook replacement built with TypeScript and DuckDB

2.2K
Archived
TypeScript
React
#notebook #open-source #typescript

orico/www.mlcompendium.com

A comprehensive compendium of resources for machine learning and deep learning development.

2.2K
Experimental
LLM Frameworks
Tutorials & Courses
Gitbook
#machine-learning #deep-learning #data-science

cerlymarco/MEDIUM_NoteBook

A repository containing notebooks of posts on Medium, focused on AI, data science, and machine learning topics.

2.1K
Archived
Jupyter Notebook
Notebooks
Tutorials & Courses
#artificial-intelligence #data-science #deep-learning

alexhallam/tv

Tidy Viewer is a cross-platform CLI tool for pretty printing CSV data with customizable column styling.

2.1K
Stable
Rust
CLI Tools
Data Visualization
#cli #csv #data-visualization

Jon-Becker/prediction-market-analysis

Framework for collecting and analyzing prediction market data with comprehensive Polymarket/Kalshi datasets.

2.1K
Active
Python
ETL & Pipelines
Example Projects
Python
#prediction-markets #polymarket #kalshi

Marktechpost/AI-Tutorial-Codes-Included

A collection of Jupyter notebooks with code and tutorials for a variety of AI projects and data science tasks.

2.1K
Active
Jupyter Notebook
Tutorials & Courses
Jupyter Notebook
#ai #machine-learning #data-science

BlazingDB/blazingsql

A GPU-accelerated SQL engine for Python, built on RAPIDS cuDF, for high-performance data processing and analysis.

2.0K
Archived
C++
GPU Acceleration
Databases
Python
#gpu #sql-engine #data-science

moj-analytical-services/splink

Fast, accurate, and scalable probabilistic data linkage with support for multiple SQL backends.

2.0K
Active
Python
Databases
ETL & Pipelines
Python
#data-matching #data-deduplication #entity-resolution

Azure/azureml-examples

Official Azure Machine Learning examples, tested with GitHub Actions, for data science and ML projects.

2.0K
Active
Jupyter Notebook
ML Ops
API Frameworks
Azure
#azure #azure-machine-learning #data-science

featureform/featureform

The Virtual Feature Store that turns existing data infrastructure into a feature store for machine learning.

2.0K
Experimental
Go
Feature Engineering
Vector Databases
Go
#data-quality #data-science #embeddings

WenjieDu/PyPOTS

A Python toolkit for building machine/deep learning models on partially-observed time series data

2.0K
Active
Python
Machine Learning Ops
Databases
PyTorch
#time-series #anomaly-detection #forecasting

bytewax/bytewax

Bytewax is a Python library for building scalable, fault-tolerant, and low-latency data processing pipelines.

2.0K
Experimental
Python
ETL & Pipelines
API Frameworks
Python
#streaming #data-engineering #data-processing

robertmartin8/MachineLearningStocks

A Python library for making stock predictions using machine learning and historical stock data.

1.9K
Archived
Python
Machine Learning
Databases
Python
#algorithmic-trading #data-science #historical-stock-fundamentals

feathr-ai/feathr

Feathr is a scalable, unified data and AI engineering platform for enterprises, offering feature engineering, feature governance, and a feature marketplace.

1.9K
Archived
Scala
Feature Flags
MLOps
Apache Spark
#data-engineering #feature-engineering #feature-governance

shervinea/mit-15-003-data-science-tools

Study guides and resources for MIT's 15.003 Data Science Tools course, covering bash, data science, git, SQL, and more.

1.9K
Archived
Tutorials & Courses
Databases
#data-science #git #sql
