Explore Projects

Discover 63 open-source projects

Active filters (1):
Search: etl

Showing 41-60 of 63 projects

Multiwoven/multiwoven

Open-source reverse ETL tool for data activation and customer data platform integration.

1.6K
Active
Ruby
API Frameworks
ETL & Pipelines
React
#data-activation #customer-data-platform #reverse-etl

ariacom/Seal-Report

Seal-Report is a .NET tool for database reporting and tasks, used for business intelligence, charting, and dashboard creation.

1.6K
Active
C#
React
#reporting-engine #database-reporting #business-intelligence

getdozer/dozer

Dozer is a real-time data movement tool that leverages CDC to move data between various sources and sinks.

1.6K
Archived
Rust
ETL & Pipelines
Realtime
Rust
#realtime #data-movement #etl

aws-samples/aws-glue-samples

AWS Glue code samples for building data integration and ETL pipelines on AWS.

1.5K
Stable
Python
ETL & Pipelines
#aws #glue #etl

superlinked/superlinked

Superlinked is a Python framework for building high-performance search & recommendation apps with structured and unstructured data.

1.5K
Stable
Jupyter Notebook
LLM Frameworks
RAG & Vector
Python
#data-pipeline #embeddings #information-retrieval

san089/goodreads_etl_pipeline

An end-to-end data pipeline for building a data lake, data warehouse, and analytics platform from GoodReads data.

1.5K
Archived
Python
ETL & Pipelines
Background Jobs
Apache Airflow
#data-engineering #etl-pipeline #data-lake

compose/transporter

Transporter is a powerful ETL tool that allows developers to sync data between various persistence engines.

1.4K
Archived
Go
ETL & Pipelines
API Frameworks
Go
#etl #data-sync #persistence-engine

wgzhao/Addax

A fast and versatile ETL tool that can transfer data between RDBMS and NoSQL databases seamlessly.

1.4K
Active
Java
ETL & Pipelines
API Frameworks
#etl #database #rdbms

toluaina/pgsync

A Python library that syncs data from Postgres to Elasticsearch/OpenSearch, enabling real-time data pipelines.

1.4K
Active
Python
ETL & Pipelines
Realtime
Python
#change-data-capture #elasticsearch-sync #postgresql

gtoonstra/etl-with-airflow

This repository provides best practices and examples for building ETL (Extract, Transform, Load) pipelines using Apache Airflow.

1.4K
Archived
Shell
ETL & Pipelines
#etl #airflow #data-pipelines

amphi-ai/amphi-etl

A visual data preparation tool powered by Python, designed for data analysis and ETL tasks.

1.4K
Active
TypeScript
ETL & Pipelines
Data Analysis
TypeScript
#data-analysis #data-pipelines #data-transformation

apache/hop

Hop is a flexible and extensible open-source data integration platform for building and orchestrating ETL and streaming pipelines.

1.3K
Active
Java
ETL & Pipelines
#data-integration #etl #orchestration

trustgraph-ai/trustgraph

An open-source platform for building and managing AI-optimized context graphs for AI-powered applications.

1.3K
Active
Python
LLM Frameworks
Agents & Orchestration
#ai #context-graph #ontology-engineering

rwynn/monstache

A Go daemon that syncs MongoDB to Elasticsearch in real-time for search-powered applications.

1.3K
Stable
Go
Realtime
ETL & Pipelines
#mongodb #elasticsearch #opensearch

singer-io/getting-started

A getting started guide to Singer, a data integration framework for ETL and data analysis.

1.3K
Stable
Makefile
#authentication #streaming #real-time

PatMartin/Dex

Dex is a powerful data visualization tool that enables data exploration and publishing of web visualizations.

1.3K
Archived
JavaScript
ETL & Pipelines
Charts & Visualization
#data-analysis #data-visualization #data-mining

datavane/tis

A Java-based framework for building agile DataOps pipelines using tools like Flink, DataX, and Chunjun with a web UI.

1.3K
Active
Java
ETL & Pipelines
API Frameworks
#dataops #etl #flink

Open-Source-Legal/OpenContracts

An enterprise-grade, API-first LLM workspace for unstructured document processing, with features like data extraction, redaction, and prompt engineering.

1.2K
Active
Python
LLM Frameworks
ETL & Pipelines
Python
#llm #prompt-engineering #etl

2ndQuadrant/pglogical

A high-performance logical replication extension for PostgreSQL that enables fast, cross-version database replication.

1.2K
Stable
C
ETL & Pipelines
API Frameworks
#database-replication #etl #logical-decoding

marsupialtail/quokka

A scalable, distributed ETL framework for building data lake analytics pipelines.

1.2K
Archived
Python
ETL & Pipelines
API Frameworks
Python
#data-lake #analytics #distributed
