Trending Projects

Discover the fastest-growing open source projects

Showing 251-300 of 897 trending projects

#251
jstat/jstat

A JavaScript statistical library that provides a wide range of statistical functions for data analysis.

+172
+10.6%
1.8K
total stars
#252
tonbo-io/tonbo

Tonbo is an embedded database for serverless and edge runtimes, optimized for offline-first and big data use cases.

+172
+12.9%
1.5K
total stars
#253
alibaba/canal

MySQL binlog incremental subscription and consumption component

+171
+0.6%
29.6K
total stars
#254
pixiedust/pixiedust

A Python helper library for enhancing Jupyter Notebooks with data visualization and analysis capabilities.

+171
+19.7%
1.0K
total stars
#255
gunnarmorling/awesome-opensource-data-engineering

An Awesome List of open-source data engineering projects for developers.

+170
+5.9%
3.0K
total stars
#256
Tencent/paxosstore

PaxosStore is a high-performance, distributed database solution built for large-scale applications.

+170
+11.0%
1.7K
total stars
#257
moj-analytical-services/splink

Fast, accurate, and scalable probabilistic data linkage with support for multiple SQL backends.

+168
+9.2%
2.0K
total stars
#258
mkazhdan/PoissonRecon

Poisson Surface Reconstruction is a C++ library for reconstructing surfaces from point cloud data.

+168
+10.3%
1.8K
total stars
#259
dask/dask

Dask is a Python library for parallel computing and distributed data processing, providing a scalable alternative to NumPy and Pandas.

+167
+1.2%
13.8K
total stars
#260
OHDSI/CommonDataModel

A definition and DDLs for the OMOP Common Data Model (CDM), a data model for healthcare data.

+166
+19.4%
1.0K
total stars
#261
tidyverse/readr

A fast and flexible R package for reading flat files (CSV, TSV, fixed-width) into R data frames.

+165
+19.1%
1.0K
total stars
#262
cmu-db/bustub

An educational relational database management system (RDBMS) implementation in C++.

+164
+3.5%
4.9K
total stars
#263
Data-Centric-AI-Community/ydata-profiling

A Python library for fast, customizable, and interactive data profiling and exploratory data analysis.

+163
+1.2%
13.4K
total stars
#264
blockchain-etl/ethereum-etl

Python scripts for extracting, transforming and loading Ethereum blockchain data into Google BigQuery.

+163
+5.5%
3.1K
total stars
#265
geekinglcq/CDCS

A collection of solutions to Chinese data competitions, primarily using Python.

+163
+10.2%
1.8K
total stars
#266
cuge1995/awesome-time-series

A curated list of resources for time series forecasting, including papers, code, and other materials.

+163
+18.5%
1.0K
total stars
#267
zvtvz/zvt

A modular quantitative trading framework for algorithmic trading, backtesting, and financial analysis.

+162
+4.2%
4.0K
total stars
#268
js-data/js-data

A framework-agnostic, datastore-agnostic JavaScript ORM built for ease of use and peace of mind.

+162
+11.1%
1.6K
total stars
#269
github/covid19-dashboard

An open-source COVID-19 dashboard powered by the fastpages framework, featuring data visualizations.

+161
+10.9%
1.6K
total stars
#270
knex/knex

SQL query builder for multiple databases

+160
+0.8%
20.2K
total stars
#271
redisson/redisson

Redisson is a Java client for Redis and Valkey with distributed objects and services

+159
+0.7%
24.3K
total stars
#272
apache/shardingsphere

Distributed SQL database middleware for sharding, scalability, and security

+159
+0.8%
20.7K
total stars
#273
paradigmxyz/cryo

cryo is a Rust library for extracting blockchain data to parquet, CSV, JSON, or Python dataframes.

+159
+11.5%
1.5K
total stars
#274
mpquant/MyTT

A Python library with most common stock market technical indicators, making it easy to implement quantitative finance and algorithmic trading.

+157
+6.4%
2.6K
total stars
#275
opendatadiscovery/odd-platform

First open-source data discovery and observability platform for data practitioners.

+157
+12.8%
1.4K
total stars
#276
lvgalvao/data-engineering-roadmap

Comprehensive roadmap for data engineering and AI development in Python

+157
+16.1%
1.1K
total stars
#277
litedb-org/LiteDB

LiteDB is a lightweight, embedded NoSQL document database for .NET applications that can be used in a single data file.

+156
+1.7%
9.4K
total stars
#278
iamseancheney/python_for_data_analysis_2nd_chinese_version

A Chinese translation of a popular book on using Python for data analysis with libraries like pandas and numpy.

+156
+1.8%
8.8K
total stars
#279
liyupi/sql-mother

A free, interactive SQL learning platform with an online SQL editor, real-time query results, and syntax highlighting.

+156
+4.0%
4.0K
total stars
#280
NateScarlet/holiday-cn

A Python tool for automatically scraping data on China's statutory holidays from government announcements.

+156
+9.3%
1.8K
total stars
#281
damklis/DataEngineeringProject

An end-to-end data engineering project example showcasing tools and technologies for building data pipelines.

+156
+12.7%
1.4K
total stars
#282
datazip-inc/olake

Fastest open-source data pipeline tool for replicating databases to data lakes in Apache Iceberg format.

+156
+13.6%
1.3K
total stars
#283
google/draco

Draco is a C++ library for compressing and decompressing 3D geometric meshes and point clouds.

+154
+2.2%
7.2K
total stars
#284
aimeos/upscheme

A database migration and schema management tool for PHP developers, supporting multiple database engines.

+154
+6.3%
2.6K
total stars
#285
cantaro86/Financial-Models-Numerical-Methods

A collection of notebooks covering quantitative finance and numerical methods in Python.

+153
+2.3%
6.7K
total stars
#286
spacejam/sled

A high-performance, concurrent, embedded key-value database written in Rust.

+152
+1.7%
8.9K
total stars
#287
cn/GB2260

A Python library for retrieving administrative division codes for China's GB/T 2260 standard.

+152
+11.0%
1.5K
total stars
#288
liuhuanyong/QASystemOnMedicalKG

A tutorial and implementation of a disease-centered medical knowledge graph and QA system.

+151
+2.1%
7.2K
total stars
#289
x-ream/sqli

A Java ORM SQL query builder that supports popular databases like ClickHouse, Impala, MySQL, and Presto.

+151
+8.9%
1.9K
total stars
#290
lakekeeper/lakekeeper

Lakekeeper is an open-source, secure, and fast Apache Iceberg REST Catalog written in Rust for data lakehouse governance.

+151
+14.3%
1.2K
total stars
#291
HouzuoGuo/tiedot

A basic document (NoSQL) database implementation in Go, suitable for small-scale projects.

+150
+5.8%
2.7K
total stars
#292
torodb/stampede

A database solution that provides better analytics on top of MongoDB and makes it easier to migrate from MongoDB to SQL.

+149
+9.3%
1.8K
total stars
#293
CamDavidsonPilon/lifetimes

A Python library for calculating customer lifetime value metrics and cohort analysis.

+149
+11.2%
1.5K
total stars
#294
elliotchance/orderedmap

An ordered map implementation in Go with amortized O(1) performance for common operations.

+149
+17.2%
1.0K
total stars
#295
binance/binance-public-data

A Python library to access historical market data from the Binance cryptocurrency exchange.

+147
+7.0%
2.3K
total stars
#296
antontarasenko/smq

A collection of SQL queries to analyze social media datasets.

+146
+10.4%
1.5K
total stars
#297
karlseguin/the-little-mongodb-book

A concise guide to the MongoDB NoSQL database for developers.

+146
+10.8%
1.5K
total stars
#298
san089/goodreads_etl_pipeline

An end-to-end data pipeline for building a data lake, data warehouse, and analytics platform from GoodReads data.

+144
+10.7%
1.5K
total stars
#299
DrTimothyAldenDavis/SuiteSparse

A powerful suite of sparse matrix algorithms and libraries for scientific and numerical computing.

+144
+11.0%
1.5K
total stars
#300
JoinQuant/jqdatasdk

A Python package for easy access to financial market data in China for quantitative finance and FinTech applications.

+144
+13.2%
1.2K
total stars