Trending Projects

Discover the fastest growing open source projects

Showing 301-350 of 897 trending projects

#301
duckdb/duckdb-wasm

WebAssembly version of the DuckDB analytical database, enabling fast in-browser analytics and SQL queries.

+303
+18.6%
1.9K
total stars
#302
holistics/dbml

A database modeling language (DBML) that helps define and document database structures.

+302
+9.3%
3.5K
total stars
#303
timeplus-io/proton

Fast, single-binary C++ SQL ETL pipeline for stream processing, observability, analytics, and AI/ML.

+301
+16.2%
2.2K
total stars
#304
apache/couchdb

An open-source, scalable, and fault-tolerant NoSQL database with a focus on reliability and offline-first design.

+300
+4.6%
6.8K
total stars
#305
NateScarlet/holiday-cn

A Python tool for automatically scraping data on China's statutory holidays from government announcements.

+300
+19.4%
1.8K
total stars
#306
zhisheng17/flink-learning

A comprehensive learning resource for the Flink stream processing framework, covering concepts, principles, and real-world use cases.

+299
+2.0%
15.1K
total stars
#307
moj-analytical-services/splink

Fast, accurate, and scalable probabilistic data linkage with support for multiple SQL backends.

+299
+17.6%
2.0K
total stars
#308
arangodb/arangodb

ArangoDB is a multi-model database supporting documents, graphs, and key-values for high-performance applications.

+294
+2.1%
14.1K
total stars
#309
matplotlib/mplfinance

A Python library for financial data visualization using Matplotlib, focused on candlestick and OHLC charts.

+291
+7.2%
4.3K
total stars
#310
bukosabino/ta

Technical Analysis Library using Pandas and Numpy for financial data analysis and trading strategies.

+290
+6.3%
4.9K
total stars
#311
WeBankFinTech/DataSphereStudio

DataSphereStudio is a one-stop data application development and management portal covering data exchange, analysis, and visualization.

+290
+9.8%
3.3K
total stars
#312
allegro/bigcache

Efficient in-memory cache in Go for storing and retrieving large amounts of data.

+286
+3.7%
8.1K
total stars
#313
submato/xhscrawl

A web scraping tool for collecting data from Xiaohongshu, Bilibili, and other Chinese social platforms.

+286
+29.6%
1.3K
total stars
#314
hugo2046/QuantsPlaybook

A quantitative research and stock analysis platform for finance professionals.

+284
+6.7%
4.5K
total stars
#315
sacridini/Awesome-Geospatial

A comprehensive collection of geospatial tools and resources for data analysis, machine learning, and spatial applications.

+283
+6.3%
4.8K
total stars
#316
rougier/scientific-visualization-book

An open-access book on scientific visualization using Python and Matplotlib.

+282
+2.6%
11.2K
total stars
#317
biopython/biopython

Biopython is a set of Python modules that provide a wide range of functionality for bioinformatics, including DNA/RNA/protein sequence analysis, phylogenetics, and more.

+282
+6.1%
4.9K
total stars
#318
apache/flink-cdc

Flink CDC is a streaming data integration tool that enables real-time data pipelines and change data capture.

+275
+4.5%
6.4K
total stars
#319
supabase/etl

A real-time Postgres data replication and streaming library built in Rust for building CDC pipelines.

+274
+14.3%
2.2K
total stars
#320
collabH/bigdata-growth

A comprehensive repository covering big data knowledge, including data warehouse modeling, real-time computing, Hadoop, Spark, and more.

+274
+18.8%
1.7K
total stars
#321
bruin-data/bruin

A data platform that enables building data pipelines with SQL, Python, and ingesting from various sources.

+274
+23.6%
1.4K
total stars
#322
alexkay/spek

An acoustic spectrum analyzer written in C++ for audio analysis and visualization.

+273
+9.3%
3.2K
total stars
#323
MIT-LCP/mimic-code

Open-source repository for sharing code related to the MIMIC family of critical care databases.

+273
+9.5%
3.1K
total stars
#324
apache/pinot

Apache Pinot is a realtime distributed OLAP datastore for fast querying of large datasets.

+272
+4.7%
6.0K
total stars
#325
dingodb/dingo

A high-performance, MySQL-compatible vector database that supports structured and unstructured data for AI-driven applications.

+268
+18.8%
1.7K
total stars
#326
kblin/ncbi-genome-download

Scripts to download genomes from the NCBI FTP servers for bioinformatics and genomics research.

+268
+33.7%
1.1K
total stars
#327
camelot-dev/camelot

A Python library for extracting tabular data from PDF files, useful for data processing and analysis.

+267
+8.0%
3.6K
total stars
#328
HouzuoGuo/tiedot

A basic document (NoSQL) database implementation in Go, suitable for small-scale projects.

+267
+10.8%
2.7K
total stars
#329
fivethirtyeight/data

A data repository for the data journalism site FiveThirtyEight, containing data and code behind their articles and graphics.

+266
+1.6%
17.3K
total stars
#330
apache/druid

Apache Druid is a high-performance real-time analytics database for data-intensive applications.

+266
+1.9%
14.0K
total stars
#331
cozodb/cozo

A transactional, relational-graph-vector database that uses Datalog for query, designed for AI and ML use cases.

+263
+7.2%
3.9K
total stars
#332
soedinglab/MMseqs2

MMseqs2 is an ultra-fast and sensitive bioinformatics tool for sequence search and clustering.

+261
+15.1%
2.0K
total stars
#333
has2k1/plotnine

A grammar of graphics library for creating highly customizable and publication-quality plots in Python.

+260
+6.1%
4.5K
total stars
#334
frectonz/sql-studio

A SQL database explorer supporting multiple database engines like SQLite, PostgreSQL, and MySQL.

+260
+8.1%
3.5K
total stars
#335
eddwebster/football_analytics

A collection of football analytics projects, data, and analysis by Edd Webster (@eddwebster).

+259
+11.6%
2.5K
total stars
#336
meltano/meltano

Meltano is a declarative, code-first data integration engine for building and scaling data and ML-powered products.

+259
+12.2%
2.4K
total stars
#337
dineug/erd-editor

An open-source, TypeScript-based Entity-Relationship Diagram (ERD) editor for developers working with databases.

+259
+19.4%
1.6K
total stars
#338
avhz/RustQuant

A Rust library for quantitative finance, including tools for machine learning, option pricing, and trading.

+258
+18.4%
1.7K
total stars
#339
koaning/drawdata

A Python library that allows developers to easily draw datasets within their notebooks.

+258
+18.7%
1.6K
total stars
#340
modin-project/modin

Modin: Scalable Pandas workflows with a single line of code change, enabling distributed data processing.

+257
+2.5%
10.4K
total stars
#341
armink/FlashDB

An ultra-lightweight database that supports key-value and time series data for embedded and IoT applications.

+257
+11.8%
2.4K
total stars
#342
GreenmaskIO/greenmask

A Go-based tool for database anonymization and synthetic data generation to help with security, QA, and data masking.

+257
+18.9%
1.6K
total stars
#343
JoinQuant/jqdatasdk

A Python package for easy access to financial market data in China for quantitative finance and FinTech applications.

+256
+26.1%
1.2K
total stars
#344
alandefreitas/matplotplusplus

Matplot++: A C++ graphics library for creating high-quality data visualizations and scientific plots.

+255
+5.6%
4.8K
total stars
#345
Visualize-ML/Book6_First-Course-in-Data-Science

A book on data science, covering topics from basic math to machine learning using Python and Jupyter Notebooks.

+255
+10.8%
2.6K
total stars
#346
DrTimothyAldenDavis/SuiteSparse

A powerful suite of sparse matrix algorithms and libraries for scientific and numerical computing.

+253
+21.0%
1.5K
total stars
#347
youngwookim/awesome-hadoop

A curated list of resources for the Hadoop ecosystem.

+253
+29.4%
1.1K
total stars
#348
ron-rs/ron

A Rust library for serializing and deserializing data in the Rusty Object Notation (RON) format.

+252
+7.0%
3.9K
total stars
#349
percona/percona-toolkit

Percona Toolkit is a collection of advanced open source database tools for MySQL, MongoDB, and PostgreSQL.

+252
+20.9%
1.5K
total stars
#350
gedeck/practical-statistics-for-data-scientists

A code repository for a book on practical statistics for data scientists.

+249
+8.3%
3.2K
total stars