Trending Projects

Discover the fastest-growing open source projects

Showing 201-250 of 897 trending projects

#201
thinh-vu/vnstock

A beginner-friendly Python toolkit for financial data extraction, analysis, and automation.

+34
+3.0%
1.2K
total stars
#202
apache/shardingsphere

Distributed SQL database middleware for sharding, scalability, and security

+33
+0.2%
20.7K
total stars
#203
alexkay/spek

An acoustic spectrum analyzer library written in C++ for audio analysis and visualization.

+33
+1.0%
3.2K
total stars
#204
gunnarmorling/awesome-opensource-data-engineering

An Awesome List of open-source data engineering projects for developers.

+33
+1.1%
3.0K
total stars
#205
iamseancheney/python_for_data_analysis_2nd_chinese_version

A Chinese translation of a popular book on using Python for data analysis with libraries like pandas and numpy.

+32
+0.4%
8.8K
total stars
#206
eddwebster/football_analytics

A collection of football analytics projects, data, and analysis by Edd Webster (@eddwebster).

+32
+1.3%
2.5K
total stars
#207
man-group/ArcticDB

ArcticDB is a high-performance, serverless DataFrame database for the Python data science ecosystem.

+32
+1.5%
2.2K
total stars
#208
cube2222/octosql

OctoSQL is a powerful SQL query tool that allows you to join, analyze, and transform data from multiple databases and file formats.

+31
+0.6%
5.2K
total stars
#209
apache/hugegraph

A highly scalable, high-performance graph database that supports over 100 billion data points.

+31
+1.1%
3.0K
total stars
#210
spacejam/sled

A high-performance, concurrent, embedded key-value database written in Rust.

+30
+0.3%
8.9K
total stars
#211
delta-io/delta-rs

A Rust library for interacting with Delta Lake, a data lake storage format, with Python bindings.

+30
+1.0%
3.2K
total stars
#212
apache/hamilton

Hamilton is an open-source ETL framework that helps data scientists and engineers build modular, testable dataflows with lineage and metadata.

+29
+1.2%
2.4K
total stars
#213
skfolio/skfolio

A Python library for portfolio optimization using scikit-learn and convex optimization techniques.

+29
+1.6%
1.9K
total stars
#214
scrollmapper/bible_databases

A collection of Bible versions and cross-reference databases.

+29
+2.0%
1.5K
total stars
#215
kedro-org/kedro

Kedro is a Python toolkit for building production-ready data science and machine learning pipelines.

+28
+0.3%
10.8K
total stars
#216
oceanbase/oceanbase

A fast, scalable, and distributed database for transactional, analytical, and AI workloads.

+28
+0.3%
10.0K
total stars
#217
TurboWay/bigdata_analyse

This is a Python project for big data analysis, focusing on HQL, SQL, and data processing.

+28
+0.6%
5.0K
total stars
#218
vlcn-io/cr-sqlite

A Rust library that provides multi-writer and CRDT support for SQLite databases.

+28
+0.8%
3.6K
total stars
#219
duckdb/duckdb-wasm

WebAssembly version of the DuckDB analytical database, enabling fast in-browser analytics and SQL queries.

+28
+1.5%
1.9K
total stars
#220
valeriansaliou/sonic

Fast, lightweight search backend alternative to Elasticsearch

+27
+0.1%
21.2K
total stars
#221
biopython/biopython

Biopython is a set of Python modules that provide a wide range of functionality for bioinformatics, including DNA/RNA/protein sequence analysis, phylogenetics, and more.

+27
+0.6%
4.9K
total stars
#222
matplotlib/mplfinance

A Python library for financial data visualization using Matplotlib, focused on candlestick and OHLC charts.

+27
+0.6%
4.3K
total stars
#223
MIT-LCP/mimic-code

Open-source repository for sharing code related to the MIMIC family of critical care databases.

+27
+0.9%
3.1K
total stars
#224
redisson/redisson

Redisson is a Java client for Redis and Valkey with distributed objects and services

+26
+0.1%
24.3K
total stars
#225
dask/dask

Dask is a Python library for parallel computing and distributed data processing, providing a scalable alternative to NumPy and Pandas.

+26
+0.2%
13.8K
total stars
#226
apache/couchdb

An open-source, scalable, and fault-tolerant NoSQL database with a focus on reliability and offline-first design.

+26
+0.4%
6.8K
total stars
#227
cantaro86/Financial-Models-Numerical-Methods

A collection of notebooks covering quantitative finance and numerical methods in Python.

+26
+0.4%
6.7K
total stars
#228
apache/flink-cdc

Flink CDC is a streaming data integration tool that enables real-time data pipelines and change data capture.

+26
+0.4%
6.4K
total stars
#229
dbt-labs/dbt-utils

Utility functions for dbt projects, a popular data transformation tool for data engineers.

+26
+1.6%
1.7K
total stars
#230
seandavi/awesome-single-cell

A curated list of software packages and data resources for single-cell analysis, including RNA-seq and ATAC-seq.

+25
+0.7%
3.7K
total stars
#231
apache/paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark.

+25
+0.8%
3.2K
total stars
#232
armink/FlashDB

An ultra-lightweight database that supports key-value and time series data for embedded and IoT applications.

+25
+1.0%
2.4K
total stars
#233
timeplus-io/proton

Fast, single-binary C++ SQL ETL pipeline for stream processing, observability, analytics, and AI/ML.

+25
+1.2%
2.2K
total stars
#234
materialsproject/pymatgen

A robust Python library for materials analysis and computational materials science.

+25
+1.4%
1.8K
total stars
#235
dicedb/dicedb

DiceDB is an open-source, fast, reactive, in-memory database optimized for modern hardware.

+24
+0.2%
10.7K
total stars
#236
Wisser/Jailer

A Java-based database subsetting and relational data browsing tool for popular databases.

+24
+0.8%
3.1K
total stars
#237
galaxyproject/galaxy

An open-source, community-driven platform for data-intensive scientific analysis and visualization.

+24
+1.4%
1.7K
total stars
#238
liuhuanyong/QASystemOnMedicalKG

A tutorial and implementation of a disease-centered medical knowledge graph and QA system.

+23
+0.3%
7.2K
total stars
#239
grantjenks/python-sortedcontainers

A Python library that provides efficient, Pythonic data structures for sorted lists, dictionaries, and sets.

+23
+0.6%
3.9K
total stars
#240
gedeck/practical-statistics-for-data-scientists

The code repository for a book on practical statistics for data scientists.

+23
+0.7%
3.2K
total stars
#241
lerocha/chinook-database

Sample database for SQL Server, Oracle, MySQL, PostgreSQL, SQLite, DB2

+23
+0.9%
2.5K
total stars
#242
tonbo-io/tonbo

Tonbo is an embedded database for serverless and edge runtimes, optimized for offline-first and big data use cases.

+23
+1.6%
1.5K
total stars
#243
ChawlaAvi/Daily-Dose-of-Data-Science

A collection of code snippets and tutorials for data science and data analysis in Python.

+23
+2.0%
1.2K
total stars
#244
arangodb/arangodb

ArangoDB is a multi-model database supporting documents, graphs, and key-values for high-performance applications.

+22
+0.2%
14.1K
total stars
#245
rapidsai/cudf

A high-performance GPU DataFrame library for data analysis and machine learning workloads.

+22
+0.2%
9.5K
total stars
#246
mage-ai/mage-ai

mage-ai is a Python-based platform for building, running, and managing data pipelines and integrating/transforming data.

+22
+0.3%
8.7K
total stars
#247
nalgeon/sqlean

The ultimate set of SQLite extensions for developers building applications with SQLite databases.

+22
+0.5%
4.3K
total stars
#248
linq2db/linq2db

Linq to database provider for .NET, supporting various database engines.

+22
+0.7%
3.2K
total stars
#249
posit-dev/great-tables

A Python library for creating easy-to-use, visually appealing data tables and summaries.

+22
+0.8%
2.6K
total stars
#250
neilotoole/sq

sq is a Go-based data wrangling tool that supports a variety of data formats and databases.

+22
+0.9%
2.5K
total stars
