A comprehensive collection of resources and learning materials for big data technologies like Flink, Spark, Hadoop, and Hive.
A curated collection of design inspiration and UI components for developers to add delight to their products.
A simple, expressive Java web framework for building API servers and web applications.
A Python library for parsing and transpiling SQL queries across various databases and engines.
mage-ai is a Python-based platform for building, running, and managing data pipelines that integrate and transform data.
An open-source data lakehouse framework that enables building data pipelines with leading big data compute engines.
A comprehensive set of translations for Laravel and related frameworks, enabling localization of web applications.
An open-source, distributed machine learning platform with support for various algorithms and autoML.
Alluxio is an open-source data orchestration platform for analytics and machine learning workloads in the cloud.
A flexible and powerful parameter server for large-scale machine learning models and distributed training.
Zeppelin is a web-based notebook for interactive, data-driven analytics and collaborative documents.
This repository contains notes on the design and implementation of the Apache Spark distributed computing framework.
SynapseML is a simple and distributed machine learning library for building and deploying AI models at scale.
lakeFS is a Git-like version control system for data lakes, enabling data engineers to manage data versioning and data quality.
An open-source cloud-native AI platform for ML/DL workflows, model serving, and distributed training.
An open-source threat hunting platform built on the ELK stack for security researchers and analysts.
Example code from the Learning Spark book, a resource for developers learning Spark.
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters for distributed machine learning.
A high-performance compressed bitset library for Java used in Apache Spark, Netflix Atlas, and others.
Deequ is a Scala library for defining "unit tests for data" to measure data quality in large datasets.