Explore Projects

Discover 299 open source projects

Active filters (1):
Search: pipelines

Showing 41-60 of 299 projects

IBM/sarama

Sarama is a Go library for Apache Kafka, a distributed streaming platform for building real-time data pipelines.

12.4K
Active
Go
API Frameworks
#kafka #streaming #real-time

earthly/earthly

A simple, fast, and repeatable build framework for creating Docker images and CI/CD pipelines.

12.0K
Stable
Go
Build Tools
#build-system #ci-cd #docker

kubescape/kubescape

Kubescape is an open-source Kubernetes security platform that provides risk analysis, security, compliance, and misconfiguration scanning.

11.2K
Active
Go
CLI Tools
#kubernetes #security #compliance

great-expectations/great_expectations

A Python library that helps ensure data quality and reliability through data profiling and testing.

11.2K
Active
Python
ETL & Pipelines
#data-quality #data-testing #data-profiling

kedro-org/kedro

Kedro is a Python toolkit for building production-ready data science and machine learning pipelines.

10.8K
Active
Python
ETL & Pipelines
#machine-learning #data-engineering #pipeline

PRQL/prql

PRQL is a modern, powerful, and pipelined SQL replacement for transforming data.

10.7K
Active
Rust
ETL & Pipelines
#data-transformation #sql-alternative #pipeline

cumulo-autumn/StreamDiffusion

StreamDiffusion is a Python library that provides a pipeline-level solution for real-time interactive generation.

10.7K
Archived
Python
LLM Frameworks
#streaming #real-time #interactive-generation

oramasearch/orama

A full-text and vector search engine for developers, with support for typo tolerance and hybrid search, in under 2 KB.

10.2K
Active
TypeScript
Search-as-a-Service
#search #vector-search #text-search

EpistasisLab/tpot

A Python Automated Machine Learning tool that optimizes ML pipelines using genetic programming.

10.0K
Stable
Jupyter Notebook
AutoML
scikit-learn
#automated-machine-learning #hyperparameter-optimization #model-selection

bigscience-workshop/petals

A distributed system for running large language models (LLMs) on personal devices, enabling faster fine-tuning and inference.

10.0K
Archived
Python
LLM Frameworks
PyTorch
#llm #distributed-computing #fine-tuning

projectdiscovery/httpx

Fast and multi-purpose HTTP toolkit for running multiple probes using the retryablehttp library.

9.6K
Active
Go
Express
#http #toolkit #multi-purpose

iam-veeramalla/Jenkins-Zero-To-Hero

This repository helps developers set up a full CI/CD pipeline with Jenkins, Docker, Kubernetes, and Argo CD for deployment.

9.4K
Experimental
Python
CI/CD
Jenkins
#cicd #docker #kubernetes

tektoncd/pipeline

A cloud-native Pipeline resource for building and deploying applications on Kubernetes.

8.9K
Active
Go
API Frameworks
Containerization
#kubernetes #pipeline #ci-cd

risingwavelabs/risingwave

An open-source, Rust-based event streaming platform for real-time data processing and analytics.

8.8K
Active
Rust
API Frameworks
Databases
#event-streaming #real-time #data-processing

mage-ai/mage-ai

mage-ai is a Python-based platform for building, running, and managing data pipelines that integrate and transform data.

8.7K
Active
Python
ETL & Pipelines
ML Ops
#data-pipelines #data-transformation #data-integration

delta-io/delta

An open-source data lakehouse framework that enables building data pipelines with leading big data compute engines.

8.6K
Active
Scala
ETL & Pipelines
API Frameworks
Spark
#big-data #data-engineering #data-lakehouse

redpanda-data/connect

A highly configurable, production-ready stream processing platform for building real-time data pipelines.

8.6K
Active
Go
Realtime
ETL & Pipelines
#stream-processing #message-queue #data-engineering

squeaky-pl/japronto

Screaming-fast Python HTTP toolkit with pipelining HTTP server based on uvloop and picohttpparser.

8.6K
Archived
C
API Frameworks
Backend Frameworks
#http #toolkit #performance

bentoml/BentoML

BentoML is an easy-to-use framework for building and deploying production-ready machine learning models as APIs.

8.5K
Active
Python
LLM Frameworks
API Clients & Testing
#ai-inference #llm-inference #llm-serving

pentaho/pentaho-kettle

Pentaho Data Integration (ETL) is a Java-based tool for building data integration and ETL pipelines.

8.3K
Active
Java
ETL & Pipelines
#etl #data-integration #pentaho