Explore Projects

Discover 20 open source projects

Active filters (1):
Search: iceberg

Showing 1-20 of 20 projects

apache/doris

Apache Doris is a high-performance, unified analytics database for real-time data processing.

15.1K
Active
Java
Databases
Spark
#database #olap #real-time

trinodb/trino

Trino is a distributed SQL query engine for big data, allowing fast, scalable, and cost-effective analytics.

12.6K
Active
Java
Databases
#big-data #analytics #data-science

StarRocks/starrocks

A high-performance open source query engine for sub-second analytics on data lakehouse.

11.4K
Active
Java
Databases
#analytics #big-data #database

risingwavelabs/risingwave

An open-source, Rust-based event streaming platform for real-time data processing and analytics.

8.8K
Active
Rust
API Frameworks
Databases
Rust
#event-streaming #real-time #data-processing

apache/iceberg

Apache Iceberg is an open-source table format for large analytic datasets, providing a versioned and scalable data lake architecture.

8.6K
Active
Java
Databases
API Frameworks
Apache
#data-lake #versioning #scalable

Eventual-Inc/Daft

High-performance data engine for AI and multimodal workloads, processing images, audio, video, and structured data at scale.

5.3K
Active
Rust
ML Ops
ETL & Pipelines
Rust
#ai-engineering #data-engineering #distributed

cocopon/iceberg.vim

A dark, bluish color scheme for Vim and Neovim, popular among developers and suitable for 'vibe coders'.

2.4K
Stable
Vim Script
IDE Extensions
UI Component Libraries
Vim
#color-scheme #dark-theme #airline

timeplus-io/proton

Fast, single-binary C++ SQL ETL pipeline for stream processing, observability, analytics, and AI/ML.

2.2K
Active
C++
ETL & Pipelines
API Frameworks
#sql #etl #stream-processing

Mooncake-Labs/pg_mooncake

A Rust-based library that provides real-time analytics on Postgres tables, supporting features like columnstore, delta-lake, and Iceberg.

1.9K
Stable
Rust
API Frameworks
Databases
#analytics #columnstore #delta-lake

apache/polaris

Apache Polaris is an open-source catalog for Apache Iceberg, a high-performance table format for data lakes.

1.9K
Active
Java
API Frameworks
Databases
Apache
#apache #iceberg #data-catalog

tansu-io/tansu

Apache Kafka-compatible broker with support for S3, PostgreSQL, SQLite, Apache Iceberg, and Delta Lake.

1.6K
Active
Rust
API Frameworks
Databases
#apache-kafka #s3 #postgresql

aws-samples/custom-lens-wa-hub

Provides a JSON template to customize AWS Well-Architected reviews using Custom Lenses.

1.5K
Stable
API Frameworks
Containerization
#aws #well-architected #custom-lens

Snowflake-Labs/pg_lake

Postgres with Iceberg and data lake access for developers.

1.4K
Active
C
MCP Servers
React
#data-lake #postgres #iceberg

projectnessie/nessie

Nessie is a transactional data catalog for data lakes that provides Git-like semantics and functionality.

1.4K
Active
Java
Databases
API Frameworks
#data-catalog #data-lakes #git-semantics

datazip-inc/olake

Fastest open-source data pipeline tool for replicating databases to data lakes in Apache Iceberg format.

1.3K
Active
Go
ETL & Pipelines
Realtime
#cdc #data-pipeline #elt

apache/iceberg-rust

A Rust implementation of the Apache Iceberg data lake table format.

1.2K
Active
Rust
API Frameworks
Databases
#apache #hacktoberfest #iceberg

lakekeeper/lakekeeper

Lakekeeper is an open-source, secure, and fast Apache Iceberg REST Catalog written in Rust for data lakehouse governance.

1.2K
Active
Rust
Databases
API Frameworks
#catalog #data-lake #iceberg

apache/incubator-xtable

Apache XTable is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

1.2K
Active
Java
ETL & Pipelines
#interoperability #lakehouse #data-processing

apache/amoro

Apache Amoro is an open-source lakehouse management system built on open table formats such as Hudi and Iceberg, with support for engines such as Flink.

1.1K
Active
Java
Databases
ETL & Pipelines
Flink
#big-data #data-lake #lakehouse

Mrkuhuo/data-warehouse-learning

Open-source data warehouse learning project with examples and code for building real-time and offline data pipelines.

1.1K
Stable
Java
ETL & Pipelines
API Frameworks
Flink
#data-engineering #etl #pipelines
