Explore Projects

Discover 16 open source projects

Active filters (1):
Search: elt

Showing 1-16 of 16 projects

apache/airflow

Apache Airflow for workflow orchestration

44.5K
Active
Python
ETL & Pipelines
Background Jobs
Python
#airflow #data-pipelines #workflow-orchestration

airbytehq/airbyte

Data integration platform for ELT pipelines from APIs, databases & files to databases, warehouses & lakes

20.8K
Active
Python
ETL & Pipelines
#data-integration #elt #etl

apache/doris

Apache Doris is a high-performance, unified analytics database for real-time data processing.

15.1K
Active
Java
Databases
Spark
#database #olap #real-time

dbt-labs/dbt-core

dbt enables data analysts and engineers to transform data using software engineering practices.

12.3K
Active
Python
ETL & Pipelines
Python
#analytics #business-intelligence #data-modeling

apache/seatunnel

A high-performance, distributed data integration tool for batch, streaming, and CDC use cases.

9.1K
Active
Java
ETL & Pipelines
Realtime
#data-integration #batch #streaming

mage-ai/mage-ai

mage-ai is a Python-based platform for building, running, and managing data pipelines and integrating/transforming data.

8.7K
Active
Python
ETL & Pipelines
ML Ops
Python
#data-pipelines #data-transformation #data-integration

apache/flink-cdc

Flink CDC is a streaming data integration tool that enables real-time data pipelines and change data capture.

6.4K
Active
Java
ETL & Pipelines
Realtime
#streaming #cdc #change-data-capture

cloudquery/cloudquery

Data pipelines for cloud config and security data, enabling CSPM, FinOps, and vulnerability management solutions.

6.3K
Active
Go
API Frameworks
ETL & Pipelines
Go
#cloud #security #data-engineering

dlt-hub/dlt

An open-source Python library that simplifies the process of loading data into data lakes and warehouses.

5.0K
Active
Python
ETL & Pipelines
CLI Tools
Python
#data-engineering #data-loading #data-pipelines

rudderlabs/rudder-server

Rudder Server is a privacy-focused, Segment-alternative customer data platform written in Go and React.

4.4K
Active
Go
Customer Data Platform
ETL & Pipelines
React
#customer-data-platform #customer-data-pipeline #data-integration

Netflix/maestro

Maestro is Netflix's workflow orchestrator for building data pipelines and batch processing workflows.

3.7K
Active
Java
ETL & Pipelines
Background Jobs
Java
#data-engineering #batch-processing #workflow-orchestration

ucbepic/docetl

A system for agentic LLM-powered data processing and ETL workflows for unstructured data analysis.

3.7K
Active
Python
Agents & Orchestration
ETL & Pipelines
Python
#agents #data-pipelines #document-processing

TobikoData/sqlmesh

Scalable and efficient data transformation framework with backwards compatibility for dbt.

2.9K
Active
Python
ETL & Pipelines
Databases
Python
#data-engineering #dataops #dbt

meltano/meltano

Meltano is a declarative, code-first data integration engine for building and scaling data and ML-powered products.

2.4K
Active
Python
ETL & Pipelines
API Frameworks
Python
#data-integration #data-pipelines #etl

quarylabs/quary

Open-source BI platform for engineers to explore and model large-scale data pipelines.

2.4K
Active
Rust
ORMs & Query Builders
ETL & Pipelines
Rust
#analytics #big-data #data-modeling

datazip-inc/olake

Fastest open-source data pipeline tool for replicating databases to data lakes in Apache Iceberg format.

1.3K
Active
Go
ETL & Pipelines
Realtime
#cdc #data-pipeline #elt
