A comprehensive resource for developers to learn and get started with data engineering using Python.
A Python library for extracting, transforming, and loading tabular data.
Fastest open-source data pipeline tool for replicating databases to data lakes in Apache Iceberg format.
An efficient awk-like language written in Rust for text processing and data manipulation tasks.
A Java-based framework for building agile DataOps pipelines using tools like Flink, DataX, and Chunjun with a web UI.
Archive, search, and analyze your entire email/chat history offline with DuckDB-powered analytics and AI queries.
A big data development platform for job submission, scheduling, operations and maintenance, and metrics display.
Provides pre-built Google Cloud Dataflow templates to simplify data processing tasks on the Google Cloud Platform.
PySpark-Tutorial provides basic algorithms using PySpark for big data analytics and data processing.
Notebooks for financial economics, including analyses of Federal Reserve, GDP, inflation, and more.
A collection of TextFSM templates for parsing network device show commands, useful for network automation.
A collection of 101 real-world web scraping exercises in Python 3 for data journalists.
An enterprise-grade, API-first LLM workspace for unstructured document processing, with features like data extraction, redaction, and prompt engineering.
A scalable, SQL-based streaming analytics platform from Uber, built on top of Apache Flink.
A Python library for financial analysis and data scraping from the Finviz platform.
An open-source Python framework for processing Earth observation data using machine learning.
The Data Change Processing platform, a C# library for building CDC (change data capture) and change detection systems.
A high-performance, open-source data processing pipeline for ingesting Kafka data and sending it to Elasticsearch.
A port of Great Expectations to dbt test macros for data testing and validation in data engineering workflows.
Blazing fast, multi-cloud data transfer solution for developers looking to move data seamlessly across cloud providers.