lakeFS is a Git-like version control system for data lakes, enabling data engineers to manage data versioning and data quality.
An open-source Python library that simplifies the process of loading data into data lakes and warehouses.
A collection of Udacity data engineering projects showcasing various tools and technologies.
Distributed high-performance data integration engine for batch, streaming, and incremental scenarios.
An end-to-end data pipeline for building a data lake, data warehouse, and analytics platform from GoodReads data.
Lakekeeper is an open-source, secure, and fast Apache Iceberg REST Catalog written in Rust for data lakehouse governance.
Apache Amoro is an open-source Lakehouse management system built on open table formats like Hudi and Iceberg, and it works with compute engines such as Flink.
Kylo is an enterprise-grade data lake management platform built on big data technologies like Spark and Hadoop.