Easy-to-use streaming application development framework and operation platform for building ETL pipelines.
Preswald is a WASM packager for Python-based interactive data apps that can be run completely in-browser.
A set of TypeScript-based cloud services and utilities for processing and extracting structured data from various document formats.
Maxwell's daemon, a MySQL-to-JSON Kafka producer for building real-time data pipelines.
A Python tool for extracting plain text from Wikipedia dumps, useful for natural language processing tasks.
A list of resources for learning data engineering from scratch.
A spreadsheet tool with AI capabilities for data analysis, engineering, and visualization.
A curated list of resources about Apache Airflow, a popular workflow management platform.
A command-line tool for running SQL queries against various data formats like JSON, CSV, Excel, and Parquet.
A Docker-based Apache Airflow platform for building and managing data pipelines and workflows.
Maestro is Netflix's workflow orchestrator for building data pipelines and batch processing workflows.
Camelot is a Python library for extracting tables from PDF files, making it easier for developers to work with PDF data.
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
A system for agentic LLM-powered data processing and ETL workflows for unstructured data analysis.
A comprehensive machine learning and data science course with Jupyter Notebook materials.
Ploomber is a fast and versatile tool for building and deploying data pipelines that can be used with a variety of AI and ML tools.
Deequ is a Scala library for defining "unit tests for data" to measure data quality in large datasets.
Flow-based programming framework for building complex JavaScript applications and services.
A blazing-fast data wrangling toolkit for AI and data engineering workflows.
A curated list of resources for creating node-based UI editors and visual programming tools.