MLflow is an open-source platform for building, tracking, and deploying AI/ML models with end-to-end observability and evaluation tools.
SynapseML is a simple and distributed machine learning library for building and deploying AI models at scale.
lakeFS is a Git-like version control system for data lakes, enabling data engineers to manage data versioning and data quality.
An open-source library and codebase-analysis tool for Scala developers working with Apache Spark.
An interactive and reactive data science platform powered by Scala and Apache Spark.
A Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
BigDL is a distributed deep learning library that allows developers to run TensorFlow, Keras, and PyTorch models on Apache Spark, Flink, and Ray.
Feathr is a scalable, unified data and AI engineering platform for enterprises, offering feature engineering, feature governance, and a feature marketplace.
A curated list of awesome Apache Spark packages and resources for developers.
A distributed real-time machine learning platform built on Apache Spark and Kafka for large-scale workloads.
This GitHub repository contains SQL data analysis and visualization projects using various tools and databases.
This repository provides an in-depth look at the internals of the popular Apache Spark data processing framework.
An end-to-end data pipeline for building a data lake, data warehouse, and analytics platform from GoodReads data.
This is a book that teaches how to use Apache Spark for lightning-fast data analytics.
A Python library that integrates Scikit-learn into the Apache Spark distributed computing framework.
GraphFrames provides DataFrame-based Graphs for Apache Spark, enabling scalable graph analysis and algorithms.
This repository provides a comprehensive guide and implementations for data algorithms using MapReduce, Spark, Java, and Scala.
Deprecated Scikit-learn integration package for Apache Spark, useful for machine learning on big data.