Trending Projects

Discover the fastest growing open source projects

Showing 301-350 of 897 trending projects

#301

duckdb/duckdb-wasm

WebAssembly version of the DuckDB analytical database, enabling fast in-browser analytics and SQL queries.

+303

+18.6%

1.9K

total stars

C++

#302

holistics/dbml

A database modeling language (DBML) that helps define and document database structures.

+302

+9.3%

3.5K

total stars

JavaScript

#303

timeplus-io/proton

Fast, single-binary C++ SQL ETL pipeline for stream processing, observability, analytics, and AI/ML.

+301

+16.2%

2.2K

total stars

C++

#304

apache/couchdb

An open-source, scalable, and fault-tolerant NoSQL database with a focus on reliability and offline-first design.

+300

+4.6%

6.8K

total stars

Erlang

#305

NateScarlet/holiday-cn

A Python tool for automatically scraping data on China's statutory holidays from government announcements.

+300

+19.4%

1.8K

total stars

Python

#306

zhisheng17/flink-learning

This is a comprehensive learning resource for the Flink stream processing framework, covering concepts, principles, and real-world use cases.

+299

+2.0%

15.1K

total stars

Java

#307

moj-analytical-services/splink

Fast, accurate, and scalable probabilistic data linkage with support for multiple SQL backends.

+299

+17.6%

2.0K

total stars

Python

#308

arangodb/arangodb

ArangoDB is a multi-model database supporting documents, graphs, and key-values for high-performance applications.

+294

+2.1%

14.1K

total stars

C++

#309

matplotlib/mplfinance

A Python library for financial data visualization using Matplotlib, focused on candlestick and OHLC charts.

+291

+7.2%

4.3K

total stars

Python

#310

bukosabino/ta

Technical Analysis Library using Pandas and Numpy for financial data analysis and trading strategies.

+290

+6.3%

4.9K

total stars

Jupyter Notebook

#311

WeBankFinTech/DataSphereStudio

DataSphereStudio is a one-stop data application development and management portal covering data exchange, analysis, and visualization.

+290

+9.8%

3.3K

total stars

Java

#312

allegro/bigcache

Efficient in-memory cache in Go for storing and retrieving large amounts of data.

+286

+3.7%

8.1K

total stars

#313

submato/xhscrawl

A web scraping tool for collecting data from Xiaohongshu, Bilibili, and other Chinese social platforms.

+286

+29.6%

1.3K

total stars

#314

hugo2046/QuantsPlaybook

A quantitative research and stock analysis platform for finance professionals.

+284

+6.7%

4.5K

total stars

Jupyter Notebook

#315

sacridini/Awesome-Geospatial

A comprehensive collection of geospatial tools and resources for data analysis, machine learning, and spatial applications.

+283

+6.3%

4.8K

total stars

#316

rougier/scientific-visualization-book

An open-access book on scientific visualization using Python and Matplotlib for data-driven developers.

+282

+2.6%

11.2K

total stars

Python

#317

biopython/biopython

Biopython is a set of Python modules that provide a wide range of functionality for bioinformatics, including DNA/RNA/protein sequence analysis, phylogenetics, and more.

+282

+6.1%

4.9K

total stars

Python

#318

apache/flink-cdc

Flink CDC is a streaming data integration tool that enables real-time data pipelines and change data capture.

+275

+4.5%

6.4K

total stars

Java

#319

supabase/etl

A real-time Postgres data replication and streaming library built in Rust for building CDC pipelines.

+274

+14.3%

2.2K

total stars

Rust

#320

collabH/bigdata-growth

A comprehensive repository covering big data knowledge, including data warehouse modeling, real-time computing, Hadoop, Spark, and more.

+274

+18.8%

1.7K

total stars

Shell

#321

bruin-data/bruin

A data platform that enables building data pipelines with SQL, Python, and ingesting from various sources.

+274

+23.6%

1.4K

total stars

#322

alexkay/spek

An acoustic spectrum analyzer library written in C++ for audio analysis and visualization.

+273

+9.3%

3.2K

total stars

C++

#323

MIT-LCP/mimic-code

Open-source repository for sharing code related to the MIMIC family of critical care databases.

+273

+9.5%

3.1K

total stars

Jupyter Notebook

#324

apache/pinot

Apache Pinot is a realtime distributed OLAP datastore for fast querying of large datasets.

+272

+4.7%

6.0K

total stars

Java

#325

dingodb/dingo

A high-performance, MySQL-compatible vector database that supports structured and unstructured data for AI-driven applications.

+268

+18.8%

1.7K

total stars

Java

#326

kblin/ncbi-genome-download

Scripts to download genomes from the NCBI FTP servers for bioinformatics and genomics research.

+268

+33.7%

1.1K

total stars

Python

#327

camelot-dev/camelot

A Python library for extracting tabular data from PDF files, useful for data processing and analysis.

+267

+8.0%

3.6K

total stars

Python

#328

HouzuoGuo/tiedot

A basic document (NoSQL) database implementation in Go, suitable for small-scale projects.

+267

+10.8%

2.7K

total stars

#329

fivethirtyeight/data

A data repository for the data journalism site FiveThirtyEight, containing data and code behind their articles and graphics.

+266

+1.6%

17.3K

total stars

Jupyter Notebook

#330

apache/druid

Apache Druid is a high-performance real-time analytics database for data-intensive applications.

+266

+1.9%

14.0K

total stars

Java

#331

cozodb/cozo

A transactional, relational-graph-vector database that uses Datalog for query, designed for AI and ML use cases.

+263

+7.2%

3.9K

total stars

Rust

#332

soedinglab/MMseqs2

MMseqs2 is an ultra-fast and sensitive bioinformatics tool for sequence search and clustering.

+261

+15.1%

2.0K

total stars

#333

has2k1/plotnine

A grammar of graphics library for creating highly customizable and publication-quality plots in Python.

+260

+6.1%

4.5K

total stars

Python

#334

frectonz/sql-studio

A SQL database explorer supporting multiple database engines like SQLite, PostgreSQL, and MySQL.

+260

+8.1%

3.5K

total stars

Rust

#335

eddwebster/football_analytics

A collection of football analytics projects, data, and analysis by Edd Webster (@eddwebster).

+259

+11.6%

2.5K

total stars

Jupyter Notebook

#336

meltano/meltano

Meltano is a declarative, code-first data integration engine for building and scaling data and ML-powered products.

+259

+12.2%

2.4K

total stars

Python

#337

dineug/erd-editor

An open-source, TypeScript-based Entity-Relationship Diagram (ERD) editor for developers working with databases.

+259

+19.4%

1.6K

total stars

TypeScript

#338

avhz/RustQuant

A Rust library for quantitative finance, including tools for machine learning, option pricing, and trading.

+258

+18.4%

1.7K

total stars

Rust

#339

koaning/drawdata

A Python library that allows developers to easily draw datasets within their notebooks.

+258

+18.7%

1.6K

total stars

JavaScript

#340

modin-project/modin

Modin: Scalable Pandas workflows with a single line of code change, enabling distributed data processing.

+257

+2.5%

10.4K

total stars

Python

#341

armink/FlashDB

An ultra-lightweight database that supports key-value and time series data for embedded and IoT applications.

+257

+11.8%

2.4K

total stars

#342

GreenmaskIO/greenmask

A Go-based tool for database anonymization and synthetic data generation to help with security, QA, and data masking.

+257

+18.9%

1.6K

total stars

#343

JoinQuant/jqdatasdk

A Python package for easy access to financial market data in China for quantitative finance and FinTech applications.

+256

+26.1%

1.2K

total stars

Python

#344

alandefreitas/matplotplusplus

Matplot++: A C++ graphics library for creating high-quality data visualizations and scientific plots.

+255

+5.6%

4.8K

total stars

C++

#345

Visualize-ML/Book6_First-Course-in-Data-Science

A book on data science, covering topics from basic math to machine learning using Python and Jupyter Notebooks.

+255

+10.8%

2.6K

total stars

Jupyter Notebook

#346

DrTimothyAldenDavis/SuiteSparse

A powerful suite of sparse matrix algorithms and libraries for scientific and numerical computing.

+253

+21.0%

1.5K

total stars

#347

youngwookim/awesome-hadoop

A curated list of resources for the Hadoop ecosystem.

+253

+29.4%

1.1K

total stars

#348

ron-rs/ron

A Rust library for serializing and deserializing data in the Rusty Object Notation (RON) format.

+252

+7.0%

3.9K

total stars

Rust

#349

percona/percona-toolkit

Percona Toolkit is a collection of advanced open source database tools for MySQL, MongoDB, and PostgreSQL.

+252

+20.9%

1.5K

total stars

Perl

#350

gedeck/practical-statistics-for-data-scientists

This is a code repository for a book on practical statistics for data scientists.

+249

+8.3%

3.2K

total stars

Jupyter Notebook

