Apache Avro is a data serialization system for efficient storage and transmission of structured data.
Confluent Schema Registry for Kafka, a central repository for managing and storing Avro, JSON, and Protobuf schemas.
A C++20 library for fast serialization, deserialization and validation using reflection, supporting multiple data formats.
A Python library for extracting schema, statistics, and entities from datasets, useful for data profiling and privacy analysis.
This GitHub repository contains over 2,000 data engineering interview questions to help developers prepare.
An Avro serialization library for JavaScript and TypeScript, used for efficient binary data encoding and schema evolution.
pmacct is a multi-purpose network monitoring tool for passive data collection and analysis.
Goavro is a Go library for encoding and decoding data in Avro, a binary serialization format.
ADAM is a genomics analysis platform with specialized file formats built using Apache Spark and Apache Parquet.
Command-line tool for managing Apache Kafka, a popular distributed streaming platform.