Apache Airflow is a platform for workflow orchestration.
Data integration platform for ELT pipelines from APIs, databases & files to databases, warehouses & lakes
Apache Doris is a high-performance, unified analytics database for real-time data processing.
dbt enables data analysts and engineers to transform data using software engineering practices.
A high-performance, distributed data integration tool for batch, streaming, and CDC use cases.
mage-ai is a Python-based platform for building, running, and managing data pipelines and for integrating and transforming data.
Flink CDC is a streaming data integration tool that enables real-time data pipelines and change data capture.
Data pipelines for cloud config and security data, enabling CSPM, FinOps, and vulnerability management solutions.
An open-source Python library that simplifies the process of loading data into data lakes and warehouses.
Rudder Server is a privacy-focused customer data platform and Segment alternative, written in Go and React.
Maestro is Netflix's workflow orchestrator for building data pipelines and batch processing workflows.
A system for agentic LLM-powered data processing and ETL workflows for unstructured data analysis.
Scalable and efficient data transformation framework with backwards compatibility for dbt.
Meltano is a declarative, code-first data integration engine for building and scaling data and ML-powered products.
Open-source BI platform for engineers to explore and model large-scale data pipelines.
An open-source data pipeline tool, billed as the fastest of its kind, for replicating databases to data lakes in Apache Iceberg format.