Explore Projects

Discover 299 open source projects

Active filters (1):
Search: pipelines

Showing 21-40 of 299 projects

PrefectHQ/prefect

Workflow orchestration for resilient data pipelines in Python

21.8K
Active
Python
ETL & Pipelines
CI/CD
Python
#data-pipelines #orchestration #workflow-automation

vectordotdev/vector

High-performance observability data pipeline for logs and metrics

21.4K
Active
Rust
Monitoring
ETL & Pipelines
#observability #data-pipeline #logs

airbytehq/airbyte

Data integration platform for ELT pipelines from APIs, databases & files to databases, warehouses & lakes

20.8K
Active
Python
ETL & Pipelines
#data-integration #elt #etl

apache/shardingsphere

Distributed SQL database middleware for sharding, scalability, and security

20.7K
Active
Java
Databases
Java
#distributed-sql #database-sharding #data-encryption

marimo-team/marimo

Reactive Python notebook for data science and AI with git-friendly, deployable, and AI-native features

19.5K
Active
Python
AI Code Editors
IDE Extensions
Python
#reactive-notebook #git-friendly #ai-native

Avaiga/taipy

Taipy is a Python library that helps developers turn data and AI algorithms into production-ready web apps quickly.

19.1K
Active
Python
Agents & Orchestration
Python
#data-engineering #data-ops #data-visualization

spotify/luigi

Luigi is a Python module that helps developers build complex batch job pipelines with dependency management and workflow orchestration.

18.7K
Active
Python
API Frameworks
#pipeline #batch-processing #dependency-management

argoproj/argo-workflows

Argo Workflows is a powerful open-source workflow engine for Kubernetes, enabling complex data processing and machine learning pipelines.

16.5K
Active
Go
ETL & Pipelines
Kubernetes
#kubernetes #pipelines #workflow

dagger/dagger

Dagger is an automation engine that helps developers build, test, and ship any codebase across CI/CD pipelines.

15.5K
Active
Go
CI/CD
Go
#ci-cd #automation #devops

kubeflow/kubeflow

Kubeflow is a machine learning toolkit for building and deploying scalable ML pipelines on Kubernetes.

15.5K
Active
ML Ops
Kubernetes
#machine-learning #kubernetes #tensorflow

getmaxun/maxun

Turn websites into clean data pipelines & structured APIs in minutes with a low-code web scraping tool.

15.2K
Active
TypeScript
API Clients & Testing
React
#web-scraping #automation #no-code

dagster-io/dagster

An open-source data orchestration platform for developing, running, and observing data pipelines and workflows.

15.1K
Active
Python
ETL & Pipelines
Python
#data-engineering #data-orchestration #workflow-automation

llmware-ai/llmware

Unified framework for building enterprise RAG pipelines with small, specialized models

14.9K
Active
Python
Next.js
#LLM Frameworks #RAG Pipelines #Small Specialized Models

elastic/logstash

Logstash is a powerful open-source data processing pipeline that can ingest, transform, and output data from a variety of sources.

14.8K
Active
Java
API Frameworks
Java
#etl #logging #real-time-processing

apache/dolphinscheduler

Apache DolphinScheduler is a modern data orchestration platform for building high-performance workflows with a low-code approach.

14.2K
Active
Java
Realtime
#workflow-orchestration #job-scheduler #data-pipelines

Unstructured-IO/unstructured

Unstructured is an open-source ETL solution for transforming complex documents into structured data for language models.

14.1K
Active
HTML
Document Processing
#document-processing #data-pipelines #natural-language-processing

memvid/memvid

A Rust-based memory layer for AI agents, enabling serverless, single-file memory with instant retrieval and long-term storage.

13.3K
Active
Rust
Agents & Orchestration
#ai #memory #knowledge-base

CodisLabs/codis

Codis is a proxy-based Redis cluster solution that supports pipelining and dynamic scaling.

13.2K
Archived
Go
API Frameworks
#redis #redis-cluster #nosql

assimp/assimp

An open-source library that simplifies the process of loading 3D file formats into a unified data structure for game development and asset pipelines.

12.8K
Active
C++
Backend Frameworks
C++
#3d-file-formats #asset-pipeline #game-development

debezium/debezium

An open-source framework for change data capture from various databases using Apache Kafka.

12.5K
Active
Java
ETL & Pipelines
Apache Kafka
#change-data-capture #event-streaming #database