Trending Projects

Discover the fastest-growing open-source projects

Showing 151-200 of 897 trending projects

#151
groue/GRDB.swift

A toolkit for SQLite databases, focused on application development with a Swift-based API.

+705
+9.3%
8.3K
total stars
#152
mpquant/Ashare

A free, open-source Python library for fetching real-time stock data from Chinese stock exchanges.

+701
+28.4%
3.2K
total stars
#153
yougov/mongo-connector

MongoDB data stream pipeline tools for managing real-time data synchronization and replication.

+699
+59.4%
1.9K
total stars
#154
timescale/pgvectorscale

A Postgres extension for high-performance vector search, complementing pgvector for scale.

+694
+31.4%
2.9K
total stars
#155
matplotlib/AnatomyOfMatplotlib

The Anatomy of Matplotlib tutorial from the SciPy conference, focused on data visualization for scientific computing.

+694
+128.5%
1.2K
total stars
#156
pubkey/rxdb

Reactive, local-first database for JavaScript apps with real-time sync and flexible storage.

+692
+3.1%
23.1K
total stars
#157
dataprofessor/code

A compilation of R and Python code for data science and machine learning projects.

+687
+200.9%
1.0K
total stars
#158
oceanbase/oceanbase

A fast, scalable, and distributed database for transactional, analytical, and AI workloads.

+686
+7.4%
10.0K
total stars
#159
memgraph/memgraph

Open-source graph database optimized for dynamic analytics and streaming data environments.

+686
+22.3%
3.8K
total stars
#160
tikv/tikv

Distributed transactional key-value database, originally created to complement TiDB.

+682
+4.3%
16.6K
total stars
#161
databendlabs/databend

Unified cloud-native data warehouse platform for analytics, search and AI, built on top of S3 storage.

+653
+7.7%
9.2K
total stars
#162
alibaba/canal

MySQL binlog incremental subscription and consumption component.

+652
+2.3%
29.6K
total stars
#163
great-expectations/great_expectations

A Python library that helps ensure data quality and reliability through data profiling and testing.

+650
+6.2%
11.2K
total stars
#164
nalgeon/redka

A Redis-compatible database implemented in Go, supporting SQL and multiple backends like PostgreSQL and SQLite.

+650
+16.7%
4.5K
total stars
#165
vesoft-inc/nebula

Nebula is a fast, open-source, distributed graph database with horizontal scalability and high availability.

+648
+5.7%
12.1K
total stars
#166
snowplow/snowplow

A powerful customer data pipeline for collecting, processing, and analyzing user events and behavior.

+642
+10.1%
7.0K
total stars
#167
pingcap/awesome-database-learning

A comprehensive list of learning materials to help developers understand database internals.

+641
+6.4%
10.7K
total stars
#168
vortex-data/vortex

An extensible, high-performance columnar file format for data storage and processing.

+641
+30.2%
2.8K
total stars
#169
garden-co/jazz

A distributed database with CRDT sync, offline support, and end-to-end encryption for vibe coders.

+639
+35.0%
2.5K
total stars
#170
markwk/qs_ledger

A personal data aggregator and analysis tool for self-tracking and quantified self enthusiasts.

+631
+148.5%
1.1K
total stars
#171
opengeos/streamlit-geospatial

A multi-page Streamlit app for geospatial data visualization and analysis, useful for housing and real estate applications.

+628
+164.4%
1.0K
total stars
#172
zhu-xlab/GlobalBuildingAtlas

GlobalBuildingAtlas is an open global and complete dataset of building polygons, heights and LoD1 3D models.

+627
+46.1%
2.0K
total stars
#173
Kotlin/dataframe

A Kotlin library for structured data processing, suitable for data analysis and data science tasks.

+627
+155.2%
1.0K
total stars
#174
SheetJS/sheetjs

SheetJS Spreadsheet Data Toolkit for data extraction and spreadsheet generation.

+626
+1.8%
36.2K
total stars
#175
simonw/datasette

An open-source multi-tool for exploring and publishing data, focused on simplifying data analysis and sharing.

+617
+6.1%
10.8K
total stars
#176
the-pudding/data

A repository of open-source data sets created for stories on The Pudding, a digital publication focused on data journalism.

+616
+141.0%
1.1K
total stars
#177
TurboWay/bigdata_analyse

A Python project for big data analysis, focusing on HQL, SQL, and data processing.

+597
+13.4%
5.0K
total stars
#178
datazip-inc/olake

Fastest open-source data pipeline tool for replicating databases to data lakes in Apache Iceberg format.

+596
+84.1%
1.3K
total stars
#179
taynaud/python-louvain

A Python library for implementing the Louvain community detection algorithm on graphs.

+594
+134.1%
1.0K
total stars
#180
opendataloader-project/opendataloader-pdf

Fast local PDF-to-Markdown/JSON converter for RAG pipelines. No GPU needed.

+591
+47.5%
1.8K
total stars
#181
fjall-rs/fjall

A high-performance, embeddable key-value storage engine written in Rust for developers building data-intensive applications.

+590
+44.3%
1.9K
total stars
#182
elastic/kibana

Kibana is an open-source data visualization and management tool for Elasticsearch.

+589
+2.9%
21.0K
total stars
#183
statsmodels/statsmodels

Statsmodels is a Python library for statistical modeling and econometrics, providing tools for data analysis and prediction.

+589
+5.5%
11.3K
total stars
#184
apache/seatunnel

A high-performance, distributed data integration tool for batch, streaming, and CDC use cases.

+581
+6.8%
9.1K
total stars
#185
li6185377/LKDBHelper-SQLite-ORM

An automatic database ORM library for Objective-C that provides thread-safe and deadlock-free database operations.

+574
+90.0%
1.2K
total stars
#186
mysql/mysql-connector-j

MySQL Connector/J is a JDBC driver that enables Java applications to connect to MySQL databases.

+574
+130.8%
1.0K
total stars
#187
dathere/qsv

Blazing-fast data wrangling toolkit for AI and data engineering workflows.

+572
+19.4%
3.5K
total stars
#188
andkret/Cookbook

A comprehensive cookbook for data engineers, covering best practices, big data, and data engineering concepts.

+571
+4.0%
15.0K
total stars
#189
redisson/redisson

Redisson is a Java client for Redis and Valkey with distributed objects and services.

+557
+2.4%
24.3K
total stars
#190
mathesar-foundation/mathesar

An open-source, self-hosted database management tool with a spreadsheet-like interface for Postgres.

+556
+12.9%
4.9K
total stars
#191
dask/dask

Dask is a Python library for parallel computing and distributed data processing, providing a scalable alternative to NumPy and Pandas.

+553
+4.2%
13.8K
total stars
#192
zhihu/kids

A C++ library for processing data streams, potentially useful for vibe coders working with AI-powered tools.

+553
+82.5%
1.2K
total stars
#193
github/covid19-dashboard

An open-source COVID-19 dashboard powered by the fastpages framework, featuring data visualizations.

+548
+50.6%
1.6K
total stars
#194
grantjenks/python-sortedcontainers

A Python library that provides efficient, Pythonic data structures for sorted lists, dictionaries, and sets.

+542
+16.0%
3.9K
total stars
#195
oxnr/awesome-bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

+540
+3.9%
14.3K
total stars
#196
delta-io/delta

An open-source data lakehouse framework that enables building data pipelines with leading big data compute engines.

+534
+6.6%
8.6K
total stars
#197
heibaiying/BigData-Notes

A comprehensive guide to big data technologies like Hadoop, Spark, Kafka, and more for developers.

+533
+3.3%
16.9K
total stars
#198
devrimgunduz/pagila

A PostgreSQL sample database for testing and learning SQL queries.

+533
+107.0%
1.0K
total stars
#199
TobikoData/sqlmesh

Scalable and efficient data transformation framework with backwards compatibility for dbt.

+532
+22.2%
2.9K
total stars
#200
skaiworldwide-oss/agensgraph

AgensGraph is a transactional graph database based on PostgreSQL for enterprise-level applications.

+514
+53.3%
1.5K
total stars