Explore Projects

Discover 56 open source projects

Active filters (1):
Search: hadoop

Showing 41-56 of 56 projects

nathanmarz/cascalog

Data processing on Hadoop without the hassle, written in Clojure.

1.4K
Archived
Clojure
API Frameworks
Databases
#hadoop #data-processing #clojure

linkedin/dr-elephant

Dr. Elephant is a performance monitoring and tuning tool for Apache Hadoop and Apache Spark.

1.4K
Archived
Java
API Frameworks
#performance-monitoring #apache-hadoop #apache-spark

DTStack/Taier

A big data development platform for submission, scheduling, operation and maintenance, and indicator information display.

1.3K
Archived
Java
API Frameworks
ETL & Pipelines
Flink
#big-data #data-pipeline #task-scheduling

apache/impala

Apache Impala is a high-performance, open-source, SQL query engine that runs on Apache Hadoop and Apache Kudu.

1.3K
Active
C++
Databases
API Frameworks
#big-data #sql #hadoop

yahoo/CaffeOnSpark

Distributed deep learning on Hadoop and Spark clusters.

1.3K
Archived
Jupyter Notebook
ML Ops
API Frameworks
#deep-learning #distributed-computing #hadoop

sequenceiq/hadoop-docker

Hadoop docker image for running Hadoop clusters in a containerized environment.

1.2K
Archived
Dockerfile
Containerization
#hadoop #docker #containerization

apache/ozone

Scalable, reliable, distributed storage system optimized for data analytics and object store workloads.

1.2K
Active
Java
Databases
API Frameworks
#big-data #hadoop #kubernetes

HariSekhon/Nagios-Plugins

A comprehensive collection of Nagios plugins for monitoring AWS, Hadoop, Cloud, Kafka, and other popular technologies.

1.1K
Active
Python
Monitoring
CLI Tools
#monitoring #cloud #devops

twitter/elephant-bird

Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.

1.1K
Archived
Java
API Frameworks
Databases
Hadoop
#hadoop #protocol-buffers #hbase

youngwookim/awesome-hadoop

A curated list of resources for the Hadoop ecosystem.

1.1K
Archived
Databases
#hadoop #big-data #data-processing

Teradata/kylo

Kylo is an enterprise-grade data lake management platform built on big data technologies like Spark and Hadoop.

1.1K
Archived
Java
ETL & Pipelines
Realtime
#data-lake #hadoop #spark

oeljeklaus-you/UserActionAnalyzePlatform

A big data platform for analyzing e-commerce user behavior using Hadoop, Spark, and Java.

1.1K
Archived
Java
API Frameworks
Databases
Spark
#big-data #data-analytics #e-commerce

mahmoudparsian/data-algorithms-book

This repository provides a comprehensive guide and implementations for data algorithms using MapReduce, Spark, Java, and Scala.

1.1K
Archived
Java
Databases
ETL & Pipelines
Apache Hadoop
#data-algorithms #mapreduce #spark

josonle/Coding-Now

A collection of study notes, ebooks, and resources on big data, machine learning, Linux, and more for developers.

1.0K
Archived
Python
Databases
CLI Tools
#big-data #machine-learning #data-analysis

apache/ranger

Apache Ranger is a data security framework for the Hadoop platform, providing comprehensive access control and auditing capabilities.

1.0K
Active
Java
Authentication
Databases
#apache #authz #security

klbostee/dumbo

Python module that simplifies writing and running Hadoop programs.

1.0K
Archived
Python
API Frameworks
ORMs & Query Builders
#hadoop #big-data #etl
