Explore Projects

Discover 31 open source projects

Search: flink

Showing 1-20 of 31 projects

dmlc/xgboost

Distributed gradient boosting library for fast and accurate data science solutions

28.1K
Active
C++
ML Ops
Multi-Purpose
#xgboost#machine-learning#distributed-systems

apache/flink

Apache Flink is a stream processing framework for real-time and batch data processing.

25.8K
Active
Java
ETL & Pipelines
Backend Frameworks
Apache Hadoop
#stream-processing#batch-processing#data-streams

zhisheng17/flink-learning

This is a comprehensive learning resource for the Flink stream processing framework, covering concepts, principles, and real-world use cases.

15.1K
Experimental
Java
Databases
#stream-processing#flink#kafka

wangzhiwubigdata/God-Of-BigData

A comprehensive collection of resources and learning materials for big data technologies like Flink, Spark, Hadoop, and Hive.

10.4K
Archived
Databases
#big-data#hadoop#spark

delta-io/delta

An open-source data lakehouse framework that enables building data pipelines with leading big data compute engines.

8.6K
Active
Scala
ETL & Pipelines
API Frameworks
Spark
#big-data#data-engineering#data-lakehouse

apache/zeppelin

Zeppelin is a web-based notebook that enables data-driven, interactive data analytics and collaborative documents.

6.6K
Active
Java
Databases
API Frameworks
Java
#big-data#database#data-analytics

apache/flink-cdc

Flink CDC is a streaming data integration tool that enables real-time data pipelines and change data capture.

6.4K
Active
Java
ETL & Pipelines
Realtime
#streaming#cdc#change-data-capture

zq2599/blog_demos

This GitHub repository contains a collection of over 600 original articles and source code samples covering Java, Docker, Kubernetes, DevOps, and more.

4.8K
Active
Java
API Frameworks
Containerization
Spring
#java#docker#kubernetes

water8394/flink-recommandSystem-demo

A real-time product recommendation system built with Flink, Redis, HBase, and Kafka.

4.5K
Archived
Java
Realtime
Caching
Flink
#flink#recommender-system#real-time

DataLinkDC/dinky

Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.

3.7K
Stable
Java
ETL & Pipelines
Databases
Apache Flink
#datalake#datawarehouse#flink

alibaba/Alink

Alink is a machine learning algorithm platform built on Apache Flink, developed by Alibaba's PAI team.

3.6K
Archived
Java
ML Ops
Databases
#machine-learning#flink#data-mining

WeBankFinTech/DataSphereStudio

DataSphereStudio is a one-stop data application development and management portal covering data exchange, analysis, and visualization.

3.3K
Stable
Java
ETL & Pipelines
API Frameworks
Spark
#data-management#data-analysis#data-visualization

lakesoul-io/LakeSoul

LakeSoul is a cloud-native, real-time Lakehouse framework for fast data ingestion and analytics on cloud storage.

3.2K
Active
Java
API Frameworks
Databases
#big-data#lakehouse#streaming

chrislusf/glow

Glow is a distributed computation system written in Go, similar to Hadoop MapReduce, Spark, and Flink.

3.2K
Archived
Go
API Frameworks
Databases
#distributed-computing#big-data#data-processing

apache/paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark.

3.2K
Active
Java
ETL & Pipelines
Realtime
#big-data#data-ingestion#flink

MoRan1607/BigDataGuide

A comprehensive guide to big data, covering various tools and technologies for learning and development.

3.1K
Active
React
#bigdata#machine learning#development

cirosantilli/china-dictatorship

Political activism documentation on Chinese government censorship, human rights, and censorship circumvention techniques.

2.9K
Active
HTML
Resource Collections
Privacy Tools
#censorship-circumvention#china-dictatorship#human-rights

geekyouth/SZT-bigdata

This is a big data analysis system for the Shenzhen metro with support for various data processing tools.

2.4K
Archived
Scala
Databases
API Frameworks
Scala
#big-data#data-analysis#metro

timeplus-io/proton

Fast, single-binary C++ SQL ETL pipeline for stream processing, observability, analytics, and AI/ML.

2.2K
Active
C++
ETL & Pipelines
API Frameworks
#sql#etl#stream-processing

alibaba/SREWorks

A cloud-native DataOps and AIOps platform for building and operating data-intensive applications.

2.0K
Stable
Java
API Frameworks
Containerization
React
#aiops#dataops#devops