Showing 251-300 of 897 trending projects
A JavaScript statistical library that provides a wide range of statistical functions for data analysis.
Tonbo is an embedded database for serverless and edge runtimes, optimized for offline-first and big data use cases.
A component for incremental subscription to and consumption of the MySQL binlog.
A Python helper library for enhancing Jupyter Notebooks with data visualization and analysis capabilities.
An Awesome List of open-source data engineering projects for developers.
PaxosStore is a high-performance, distributed database solution built for large-scale applications.
Fast, accurate, and scalable probabilistic data linkage with support for multiple SQL backends.
Poisson Surface Reconstruction is a C++ library for reconstructing surfaces from point cloud data.
Dask is a Python library for parallel computing and distributed data processing, providing a scalable alternative to NumPy and Pandas.
A definition and DDLs for the OMOP Common Data Model (CDM), a data model for healthcare data.
A fast and flexible R package for reading flat files (CSV, TSV, fixed-width) into R data frames.
An educational relational database management system (RDBMS) implementation in C++.
A Python library for fast, customizable, and interactive data profiling and exploratory data analysis.
Python scripts for extracting, transforming and loading Ethereum blockchain data into Google BigQuery.
A collection of solutions to Chinese data competitions, primarily using Python.
A curated list of resources for time series forecasting, including papers, code, and other materials.
A modular quantitative trading framework for algorithmic trading, backtesting, and financial analysis.
A framework-agnostic, datastore-agnostic JavaScript ORM built for ease of use and peace of mind.
An open-source COVID-19 dashboard powered by the fastpages framework, featuring data visualizations.
Redisson is a Java client for Redis and Valkey with distributed objects and services.
Distributed SQL database middleware for sharding, scalability, and security.
cryo is a Rust library for extracting blockchain data to parquet, CSV, JSON, or Python dataframes.
A Python library of the most common stock market technical indicators, for use in quantitative finance and algorithmic trading.
First open-source data discovery and observability platform for data practitioners.
A comprehensive roadmap for data engineering and AI development in Python.
LiteDB is a lightweight, embedded NoSQL document database for .NET applications that stores its data in a single file.
A Chinese translation of a popular book on using Python for data analysis with libraries like pandas and numpy.
A free, interactive SQL learning platform with an online SQL editor, real-time query results, and syntax highlighting.
A Python tool for automatically scraping data on China's statutory holidays from government announcements.
An end-to-end data engineering project example showcasing tools and technologies for building data pipelines.
Fastest open-source data pipeline tool for replicating databases to data lakes in Apache Iceberg format.
Draco is a C++ library for compressing and decompressing 3D geometric meshes and point clouds.
A database migration and schema management tool for PHP developers, supporting multiple database engines.
A collection of notebooks covering quantitative finance and numerical methods in Python.
A high-performance, concurrent, embedded key-value database written in Rust for vibe coders.
A Python library for retrieving administrative division codes for China's GB/T 2260 standard.
A tutorial and implementation of a disease-centered medical knowledge graph and QA system.
A Java ORM SQL query builder that supports popular databases like ClickHouse, Impala, MySQL, and Presto.
Lakekeeper is an open-source, secure, and fast Apache Iceberg REST Catalog written in Rust for data lakehouse governance.
A basic document (NoSQL) database implementation in Go, suitable for small-scale projects.
A database solution that provides better analytics on top of MongoDB and makes it easier to migrate from MongoDB to SQL.
A Python library for calculating customer lifetime value metrics and cohort analysis.
An ordered map implementation in Go with amortized O(1) performance for common operations.
A Python library to access historical market data from the Binance cryptocurrency exchange.
A collection of SQL queries to analyze social media datasets.
A concise guide to the MongoDB NoSQL database for developers.
An end-to-end data pipeline for building a data lake, data warehouse, and analytics platform from GoodReads data.
A powerful suite of sparse matrix algorithms and libraries for scientific and numerical computing.
A Python package for easy access to financial market data in China for quantitative finance and FinTech applications.