Apache Beam is a unified programming model for batch and streaming data processing.
Apache DataFusion is a powerful SQL query engine written in Rust, designed for big data processing and analysis.
An open-source, distributed machine learning platform with support for various algorithms and autoML.
Arkime is an open-source packet capture and network monitoring system for security and network analysis.
An open-source, scalable, and fault-tolerant NoSQL database with a focus on reliability and offline-first design.
Vespa is an AI-powered search and recommendation engine for building data-driven, scalable applications.
An open-source feature store for AI/ML applications.
Zeppelin is a web-based notebook for interactive, data-driven analytics and collaborative documents.
Hazelcast is a high-performance, distributed in-memory data platform for real-time insights and stream processing.
Pachyderm is a data-centric pipeline and data versioning platform for building and scaling data-intensive applications.
Apache Hive is data warehouse software built on top of Apache Hadoop for querying and managing large datasets.
High-performance data engine for AI and multimodal workloads, processing images, audio, video, and structured data at scale.
SynapseML is a simple and distributed machine learning library for building and deploying AI models at scale.
A Python library for building scalable news feeds, activity streams, and notification systems using Cassandra and Redis.
Easily convert large sets of image URLs into a dataset for AI/ML training and experimentation.
CrateDB is a distributed, scalable SQL database for storing and analyzing massive amounts of data in near real-time.
A high-performance Java JSON library for fast serialization and deserialization.
Koalas is a pandas-like API for Apache Spark, enabling data scientists to work with big data using familiar pandas syntax.
LakeSoul is a cloud-native, real-time Lakehouse framework for fast data ingestion and analytics on cloud storage.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark.