Conversational data analysis with LLMs using natural language queries on databases, CSVs, and data lakes.
Trino is a distributed SQL query engine for big data, allowing fast, scalable, and cost-effective analytics.
A high-performance, open-source query engine for sub-second analytics on data lakehouses.
Versatile database for AI, supporting storage, querying, versioning, and visualization of any AI data.
Open-source, cloud-native, unified observability database for metrics, logs and traces, supporting SQL/PromQL/Streaming.
lakeFS is a Git-like version control system for data lakes, enabling data engineers to manage data versioning and data quality.
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
LakeSoul is a cloud-native, real-time Lakehouse framework for fast data ingestion and analytics on cloud storage.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark.
An open-source data catalog platform for building a high-performance, federated metadata lake.
Apache Kafka-compatible broker with support for S3, PostgreSQL, SQLite, Apache Iceberg, and Delta Lake.
LeoFS is a distributed, scalable, and fault-tolerant object storage system for developers working with large data volumes.
Scalable identity resolution, entity resolution, data mastering, and deduplication using ML.
A collection of open-source Kafka connectors for various data sources and destinations maintained by Lenses.io.