Explore Projects

Discover 13 open source projects

Active filters (1):
Search: dask

Showing 1-13 of 13 projects

dmlc/xgboost

Distributed gradient boosting library for fast and accurate data science solutions

28.1K
Active
C++
ML Ops
Multi-Purpose
#xgboost #machine-learning #distributed-systems

dask/dask

Dask is a Python library for parallel computing and distributed data processing, providing a scalable alternative to NumPy and Pandas.

13.8K
Active
Python
Databases
Python
#parallel-computing #distributed-data-processing #data-analysis

rapidsai/cudf

A high-performance GPU DataFrame library for data analysis and machine learning workloads.

9.5K
Active
C++
Databases
Python
#data-analysis #data-science #gpu

mars-project/mars

A unified framework for large-scale data computation that scales popular Python data tools like NumPy, Pandas, and Scikit-Learn.

2.7K
Archived
Python
ML Ops
Caching
Dask
#machine-learning #data-processing #scale

fugue-project/fugue

A unified interface for distributed computing on Spark, Dask and Ray without any rewrites.

2.1K
Active
Python
Databases
API Frameworks
Python
#distributed-computing #spark #dask

dask/dask-tutorial

An interactive tutorial for the Dask distributed computing library, focused on data analysis and manipulation.

1.9K
Stable
Jupyter Notebook
Databases
Tutorials & Courses
#data-analysis #distributed-computing #data-manipulation

dask/distributed

A distributed task scheduler for Dask, a popular Python library for parallel and distributed computing.

1.7K
Active
Python
API Frameworks
Databases
Python
#distributed-computing #parallel-processing #task-scheduling

narwhals-dev/narwhals

Lightweight and extensible compatibility layer between popular dataframe libraries like Pandas, Dask, and PySpark.

1.5K
Active
Python
Databases
CLI Tools
Python
#dataframes #compatibility #pandas

hi-primus/optimus

Agile data preparation workflows made easy with popular Python data science libraries.

1.5K
Archived
Python
ETL & Pipelines
API Frameworks
#big-data-cleaning #data-analysis #data-cleaning

holoviz/hvplot

A high-level plotting library for data visualization in Python, built on top of HoloViews.

1.3K
Active
Python
Charts & Visualization
Python
#data-visualization #plotting #pandas

Nixtla/mlforecast

A scalable machine learning library for time series forecasting in Python.

1.2K
Active
Python
ML Ops
Databases
Python
#time-series #forecasting #machine-learning

pytroll/satpy

A Python package for processing earth-observing satellite data with support for common data formats and tools.

1.2K
Active
Python
Databases
ETL & Pipelines
Python
#satellite #weather #climate

itamarst/eliot

Eliot is a Python logging library that provides detailed causality analysis and tracing for complex distributed systems.

1.2K
Active
Python
API Frameworks
Tracing
Twisted
#logging #causality-analysis #tracing
