Open-source reverse ETL tool for data activation and customer data platform integration.
Seal-Report is a .NET tool for database reporting and tasks, used for business intelligence, charting, and dashboard creation.
Dozer is a real-time data movement tool that leverages CDC to move data between various sources and sinks.
AWS Glue code samples for building data integration and ETL pipelines on AWS.
Superlinked is a Python framework for building high-performance search & recommendation apps with structured and unstructured data.
An end-to-end data pipeline for building a data lake, data warehouse, and analytics platform from GoodReads data.
Transporter is a powerful ETL tool that allows developers to sync data between various persistence engines.
A fast and versatile ETL tool that can transfer data between RDBMS and NoSQL databases seamlessly.
A Python library that syncs data from Postgres to Elasticsearch/OpenSearch, enabling real-time data pipelines.
This repository provides best practices and examples for building ETL (Extract, Transform, Load) pipelines using Apache Airflow.
A visual data preparation tool powered by Python, designed for data analysis and ETL tasks.
Hop is a flexible and extensible open-source data integration platform for building and orchestrating ETL and streaming pipelines.
An open-source platform for building and managing context graphs optimized for AI-powered applications.
A Go daemon that syncs MongoDB to Elasticsearch in real-time for search-powered applications.
A getting-started guide to Singer, a data integration framework for ETL and data analysis.
Dex is a powerful data visualization tool that enables data exploration and publishing of web visualizations.
A Java-based framework for building agile DataOps pipelines using tools like Flink, DataX, and Chunjun with a web UI.
An enterprise-grade, API-first LLM workspace for unstructured document processing, with features like data extraction, redaction, and prompt engineering.
A high-performance logical replication extension for PostgreSQL that enables fast, cross-version database replication.
A scalable, distributed ETL framework for building data lake analytics pipelines.