Trending Projects

Discover the fastest growing open source projects

Showing 51-100 of 897 trending projects

#51
facebook/rocksdb

Embeddable, persistent key-value store for fast storage with LSM design

+1.6K
+5.4%
31.6K
total stars
#52
pymupdf/PyMuPDF

A high-performance Python library for data extraction, analysis, conversion and manipulation of PDF and other documents.

+1.6K
+20.9%
9.2K
total stars
#53
fluvio-community/fluvio

Fluvio is an event stream processing engine for developers to build responsive data-intensive apps.

+1.6K
+43.5%
5.2K
total stars
#54
treeverse/dvc

dvc is a data versioning and ML experiments tool that helps developers manage and track data and model changes.

+1.5K
+10.6%
15.4K
total stars
#55
dgraph-io/dgraph

High-performance distributed graph database for real-time use cases

+1.4K
+7.2%
21.6K
total stars
#56
1nchaos/adata

Open-source, free A-share quantitative trading data platform focused on China's stock market

+1.4K
+54.7%
4.0K
total stars
#57
compose/transporter

Transporter is a powerful ETL tool that allows developers to sync data between various persistence engines.

+1.4K
+2794.0%
1.4K
total stars
#58
influxdata/influxdb

Time-series database for metrics & analytics

+1.3K
+4.5%
31.4K
total stars
#59
kuzudb/kuzu

Fast, embedded graph database with vector search and full-text search, compatible with Cypher queries.

+1.3K
+55.8%
3.7K
total stars
#60
apache/doris

Apache Doris is a high-performance, unified analytics database for real-time data processing.

+1.3K
+9.7%
15.1K
total stars
#61
pingcap/tidb

Cloud-native distributed SQL database for modern applications

+1.3K
+3.4%
39.9K
total stars
#62
duckdb/ducklake

DuckLake is an integrated data lake and catalog format written in C++.

+1.3K
+109.9%
2.5K
total stars
#63
google/leveldb

Fast key-value storage library for C++

+1.3K
+3.4%
38.9K
total stars
#64
questdb/questdb

QuestDB is a high-performance, open-source, time-series database for real-time analytics and financial applications.

+1.3K
+8.2%
16.7K
total stars
#65
x-ream/sqli

A Java ORM SQL query builder that supports popular databases like ClickHouse, Impala, MySQL, and Presto.

+1.3K
+217.8%
1.9K
total stars
#66
sqlite/sqlite

Official Git mirror of the SQLite source tree, a popular and widely-used embedded database engine.

+1.2K
+15.5%
9.1K
total stars
#67
juicedata/juicefs

JuiceFS is a distributed POSIX file system built on top of Redis and S3 for big data and cloud-native applications.

+1.2K
+10.1%
13.3K
total stars
#68
sqlitebrowser/sqlitebrowser

SQLite database management tool with GUI

+1.2K
+5.4%
23.7K
total stars
#69
dbt-labs/dbt-core

dbt enables data analysts and engineers to transform data using software engineering practices.

+1.2K
+10.8%
12.3K
total stars
#70
StarRocks/starrocks

A high-performance open source query engine for sub-second analytics on data lakehouse.

+1.2K
+11.6%
11.4K
total stars
#71
apache/hamilton

Hamilton is an open-source ETL framework that helps data scientists and engineers build modular, testable dataflows with lineage and metadata.

+1.1K
+90.1%
2.4K
total stars
#72
apache/iceberg

Apache Iceberg is an open-source table format for large analytic datasets, providing a versioned and scalable data lake architecture.

+1.1K
+15.3%
8.6K
total stars
#73
mongodb/mongo

MongoDB database server and tools

+1.1K
+4.2%
28.2K
total stars
#74
js-data/js-data

A framework-agnostic, datastore-agnostic JavaScript ORM built for ease of use and peace of mind.

+1.1K
+225.5%
1.6K
total stars
#75
typeorm/typeorm

ORM for TypeScript and JavaScript with support for multiple databases and platforms.

+1.1K
+3.2%
36.4K
total stars
#76
theOehrly/Fast-F1

A Python package for accessing and analyzing Formula 1 racing data, including results, schedules, timing, and telemetry.

+1.1K
+32.9%
4.5K
total stars
#77
vitessio/vitess

Distributed MySQL database system for horizontal scaling

+1.1K
+5.7%
20.8K
total stars
#78
cockroachdb/cockroach

Distributed SQL database for cloud-native apps

+1.1K
+3.6%
32.0K
total stars
#79
trinodb/trino

Trino is a distributed SQL query engine for big data, allowing fast, scalable, and cost-effective analytics.

+1.1K
+9.5%
12.6K
total stars
#80
paradedb/paradedb

A Rust-based, Elasticsearch-quality search engine for PostgreSQL, enabling fast, real-time analytics and HTAP use cases.

+1.1K
+14.8%
8.5K
total stars
#81
apache/datafusion

Apache DataFusion is a powerful SQL query engine written in Rust, designed for big data processing and analysis.

+1.1K
+14.7%
8.5K
total stars
#82
alibaba/AliSQL

AliSQL is a MySQL branch originating from Alibaba Group, focused on high performance and scalability.

+1.1K
+23.2%
5.8K
total stars
#83
dlt-hub/dlt

An open-source Python library that simplifies the process of loading data into data lakes and warehouses.

+1.1K
+27.4%
5.0K
total stars
#84
redis-windows/redis-windows

Redis 6.0.20 through 8.0.0 for Windows, a popular open-source in-memory data structure store.

+1.1K
+43.4%
3.5K
total stars
#85
mukunku/ParquetViewer

A simple Windows desktop app for viewing and querying Apache Parquet files, a popular big data format.

+1.1K
+2134.0%
1.1K
total stars
#86
orbitinghail/graft

Graft is an open-source transactional storage engine optimized for lazy, partial, and strongly consistent replication, ideal for edge, offline-first, and distributed applications.

+1.0K
+280.1%
1.4K
total stars
#87
marcboeker/go-duckdb

A Go database/sql driver for the DuckDB database engine, enabling fast and efficient data processing.

+1.0K
+2011.8%
1.1K
total stars
#88
youssefHosni/Data-Science-Interview-Questions-Answers

A curated list of data science interview questions and answers for developers.

+1.0K
+22.8%
5.5K
total stars
#89
torodb/stampede

A database solution that provides better analytics on top of MongoDB and makes it easier to migrate from MongoDB to SQL.

+1.0K
+139.8%
1.8K
total stars
#90
redis/go-redis

Redis client for Go with support for Redis 8.0+

+1.0K
+4.8%
22.0K
total stars
#91
apache/arrow

Apache Arrow is a fast columnar data format and toolset for in-memory analytics and data interchange.

+1.0K
+6.5%
16.6K
total stars
#92
SciRuby/daru

daru is a Ruby library for data analysis and manipulation, aimed at data scientists and developers.

+1.0K
+1980.4%
1.1K
total stars
#93
databricks/spark-csv

CSV Data Source for Apache Spark 1.x, a Scala library for working with structured data.

+1.0K
+1724.1%
1.1K
total stars
#94
apache/celeborn

Apache Celeborn is a high-performance shuffle and spilled data service for big data applications.

+989
+1978.0%
1.0K
total stars
#95
facebookresearch/cc_net

Tools to download and clean up Common Crawl data, a large web crawl dataset, for further analysis and processing.

+988
+1976.0%
1.0K
total stars
#96
CJ-Chen/TBtools-II

A powerful GUI/CLI tool for biologists to work with NGS data, not a vibe coder tool.

+981
+1962.0%
1.0K
total stars
#97
google/or-tools

Google's Operations Research tools for combinatorial optimization and linear programming.

+971
+8.0%
13.2K
total stars
#98
KeithGalli/pandas

A Python library for data manipulation and analysis, part of the core data science toolkit.

+969
+1076.7%
1.1K
total stars
#99
ranaroussi/quantstats

Portfolio analytics library for quantitative finance, built with Python

+968
+16.6%
6.8K
total stars
#100
allenai/s2orc

A large-scale open-access corpus of scientific papers and metadata for researchers and developers.

+968
+1936.0%
1.0K
total stars