Kedro is a Python toolkit for building production-ready data science and machine learning pipelines.
PRQL is a modern, powerful, and pipelined SQL replacement for transforming data.
A flexible and standardized cookiecutter template for doing and sharing data science work in Python.
A high-performance, distributed data integration tool for batch, streaming, and CDC use cases.
Automatic feature extraction from time series data for data science and machine learning applications.
A terminal spreadsheet multitool for discovering and arranging data.
An open-source, Rust-based event streaming platform for real-time data processing and analytics.
mage-ai is a Python-based platform for building, running, and managing data pipelines that integrate and transform data.
An open-source data lakehouse framework that enables building data pipelines with leading big data compute engines.
A highly configurable, production-ready stream processing platform for building real-time data pipelines.
Apache Beam is a unified programming model for batch and streaming data processing.
A high-performance Python library for working with large tabular datasets, offering efficient data manipulation and visualization.
Apache DataFusion is a powerful SQL query engine written in Rust, designed for big data processing and analysis.
Pentaho Data Integration (ETL) is a Java-based tool for building data integration and ETL pipelines.
INFO-SPIDER is an open-source web scraping toolkit that helps users retrieve data from various sources like email, e-commerce, and social platforms.
A comprehensive list of libraries, tools, and APIs for web scraping and data processing.
Steampipe is a zero-ETL, SQL-powered platform for live querying cloud APIs and infrastructure.
Fluent Bit is a fast and lightweight log, metrics, and traces processor for Linux, BSD, macOS, and Windows.
An open-source Python library for automated feature engineering in machine learning.
Tabula is a tool for extracting data from PDF files, allowing developers to easily parse and extract tables.