Real-time analytics database for generating data reports
Data integration platform for ELT pipelines from APIs, databases & files to databases, warehouses & lakes
Presto is an open-source distributed SQL query engine for big data, allowing fast analysis of large datasets.
Apache Doris is a high-performance, unified analytics database for real-time data processing.
A high-performance open-source query engine for sub-second analytics on the data lakehouse.
Unified cloud-native data warehouse platform for analytics, search and AI, built on top of S3 storage.
An open-source data lakehouse framework that enables building data pipelines with leading big data compute engines.
An open-source data format for building high-performance multimodal AI applications with fast random access, vector indexing, and data versioning.
LakeSoul is a cloud-native, real-time Lakehouse framework for fast data ingestion and analytics on cloud storage.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark.
An open-source data catalog platform for building a high-performance, federated metadata lake.
A Rust-based library that provides real-time analytics on Postgres tables, with support for columnstore storage and the Delta Lake and Iceberg formats.
Apache Fluss is a real-time streaming storage platform built for big data analytics.
An open-source data pipeline tool for fast replication of databases to data lakes in Apache Iceberg format.
Lakekeeper is an open-source, secure, and fast Apache Iceberg REST Catalog written in Rust for data lakehouse governance.
Apache XTable is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
Apache Amoro is an open-source Lakehouse management system built on open table formats such as Hudi and Iceberg, and it works with compute engines such as Flink.