Hamilton is an open-source ETL framework that helps data scientists and engineers build modular, testable dataflows with lineage and metadata.
Provides machine learning metrics for distributed, scalable PyTorch applications.
Malloy is an open-source language for describing data relationships and transformations.
Meltano is a declarative, code-first data integration engine for building and scaling data and ML-powered products.
Open-source BI platform for engineers to explore and model large-scale data pipelines.
A fast and memory-efficient library for importing and exporting Excel files in Laravel applications.
Instill Core is an open-source AI infrastructure tool for orchestrating data, models, and pipelines to build AI-powered applications.
ArcticDB is a high-performance, serverless DataFrame database for the Python data science ecosystem.
A real-time Postgres data replication and streaming library built in Rust for building CDC pipelines.
TFX is an end-to-end platform for deploying production ML pipelines.
A lightweight stream processing library for Go developers that supports various streaming platforms.
Fast, single-binary C++ SQL ETL pipeline for stream processing, observability, analytics, and AI/ML.
A Python scraper for extracting data from Facebook Page posts for statistical analysis.
Framework for collecting and analyzing prediction market data with comprehensive Polymarket/Kalshi datasets.
Quantitative trading system with ML analysis, real-time data processing, and risk management.
Fast, accurate, and scalable probabilistic data linkage with support for multiple SQL backends.
Deep learning library for Apache Spark that provides high-level APIs and models for building machine learning pipelines.
Apache DataFusion Ballista is a distributed query engine for big data analysis, built with Rust and Arrow.
A JavaScript library that converts CSV and tab-delimited data to web-friendly formats like JSON and XML.
Bytewax is a Python library for building scalable, fault-tolerant, and low-latency data processing pipelines.