Trending Projects

Discover the fastest growing open source projects

Showing 201-250 of 897 trending projects

#201
tikv/tikv

Distributed transactional key-value database, originally created to complement TiDB.

+235
+1.4%
16.6K
total stars
#202
opengeospatial/geoparquet

A specification for storing geospatial vector data (point, line, polygon) in the Parquet file format, enabling efficient cloud-native geospatial data processing.

+234
+29.6%
1.0K
total stars
#203
oxnr/awesome-bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

+232
+1.6%
14.3K
total stars
#204
garden-co/jazz

A distributed database with CRDT sync, offline support, and end-to-end encryption.

+232
+10.4%
2.5K
total stars
#205
dgraph-io/badger

Fast, embeddable key-value database written in Go for building high-performance storage applications.

+231
+1.5%
15.5K
total stars
#206
pudo/dataset

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.

+231
+5.0%
4.9K
total stars
#207
lit26/finvizfinance

A Python library for financial analysis and data scraping from the Finviz platform.

+227
+22.7%
1.2K
total stars
#208
hannorein/rebound

An open-source N-body simulation library for astrophysics and planetary science.

+226
+27.7%
1.0K
total stars
#209
plotters-rs/plotters

A high-quality, cross-platform data plotting library for Rust developers, including WebAssembly support.

+224
+5.2%
4.5K
total stars
#210
statsmodels/statsmodels

Statsmodels is a Python library for statistical modeling and econometrics, providing tools for data analysis and prediction.

+222
+2.0%
11.3K
total stars
#211
orioledb/orioledb

OrioleDB is a cloud-native PostgreSQL extension that solves performance and scalability challenges.

+222
+5.9%
4.0K
total stars
#212
felt/tippecanoe

Build vector tilesets from large collections of GeoJSON features.

+222
+18.1%
1.4K
total stars
#213
vesoft-inc/nebula

Nebula is a fast, open-source, distributed graph database with horizontal scalability and high availability.

+221
+1.9%
12.1K
total stars
#214
dineug/erd-editor

An open-source, TypeScript-based Entity-Relationship Diagram (ERD) editor for developers working with databases.

+220
+16.0%
1.6K
total stars
#215
delta-io/delta

An open-source data lakehouse framework that enables building data pipelines with leading big data compute engines.

+219
+2.6%
8.6K
total stars
#216
databendlabs/databend

Unified cloud-native data warehouse platform for analytics, search and AI, built on top of S3 storage.

+218
+2.4%
9.2K
total stars
#217
wireservice/csvkit

A suite of utilities for converting to and working with CSV, the king of tabular file formats.

+218
+3.5%
6.4K
total stars
#218
dpilger26/NumCpp

A C++ implementation of the Python NumPy library for scientific computing and numerical analysis.

+209
+5.6%
3.9K
total stars
#219
nalgeon/redka

A Redis-compatible database implemented in Go, supporting SQL and multiple backends like PostgreSQL and SQLite.

+208
+4.8%
4.5K
total stars
#220
veb-101/Data-Science-Projects

A collection of data science projects in Python using Jupyter Notebook.

+207
+8.8%
2.6K
total stars
#221
apache/seatunnel

A high-performance, distributed data integration tool for batch, streaming, and CDC use cases.

+204
+2.3%
9.1K
total stars
#222
chezou/tabula-py

A simple Python wrapper for the Tabula Java library, which extracts tables from PDF files into Pandas DataFrames.

+204
+9.7%
2.3K
total stars
#223
bruin-data/bruin

A data platform that enables building data pipelines with SQL, Python, and ingesting from various sources.

+204
+16.6%
1.4K
total stars
#224
ptyadana/SQL-Data-Analysis-and-Visualization-Projects

This GitHub repository contains SQL data analysis and visualization projects using various tools and databases.

+201
+13.7%
1.7K
total stars
#225
elastic/kibana

Kibana is an open-source data visualization and management tool for Elasticsearch.

+198
+0.9%
21.0K
total stars
#226
benbjohnson/thesecretlivesofdata

A JavaScript library for visualizing and understanding complex data structures.

+198
+5.8%
3.6K
total stars
#227
ngaut/builddatabase

A distributed SQL database built from scratch.

+196
+10.0%
2.1K
total stars
#228
xtensor-stack/xtensor

A C++ library for multidimensional array operations with broadcasting and lazy computing.

+195
+5.5%
3.7K
total stars
#229
brandon-rhodes/pycon-pandas-tutorial

A tutorial for using the popular Python data analysis library Pandas, presented at PyCon 2015.

+195
+22.3%
1.1K
total stars
#230
ContextLab/hypertools

A Python toolbox for gaining geometric insights into high-dimensional data.

+194
+11.5%
1.9K
total stars
#231
SheetJS/sheetjs

SheetJS Spreadsheet Data Toolkit for data extraction and spreadsheet generation.

+189
+0.5%
36.2K
total stars
#232
synthetichealth/synthea

Synthea is an open-source synthetic patient population simulator for generating realistic healthcare data.

+186
+6.6%
3.0K
total stars
#233
dolthub/go-mysql-server

A MySQL-compatible relational database with a storage-agnostic query engine, implemented in Go.

+186
+7.7%
2.6K
total stars
#234
microsoft/sql-server-samples

This repository contains code samples for SQL Server, Azure SQL, and related data services from Microsoft.

+183
+1.7%
10.9K
total stars
#235
PeerDB-io/peerdb

Fast, cost-effective data replication tool from Postgres to data warehouses, queues, and storage.

+183
+6.5%
3.0K
total stars
#236
hosseinmoein/DataFrame

C++ DataFrame library for statistical, financial, and machine learning analysis.

+183
+6.7%
2.9K
total stars
#237
apache/fluss

Apache Fluss is a real-time streaming storage platform built for big data analytics.

+182
+11.2%
1.8K
total stars
#238
TobikoData/sqlmesh

Scalable and efficient data transformation framework with backwards compatibility for dbt.

+181
+6.6%
2.9K
total stars
#239
treeverse/lakeFS

lakeFS is a Git-like version control system for data lakes, enabling data engineers to manage data versioning and data quality.

+180
+3.6%
5.2K
total stars
#240
josonle/Coding-Now

A collection of study notes, ebooks, and resources on big data, machine learning, Linux, and more for developers.

+180
+20.8%
1.0K
total stars
#241
yougov/mongo-connector

MongoDB data stream pipeline tools for managing real-time data synchronization and replication.

+178
+10.5%
1.9K
total stars
#242
kedro-org/kedro

Kedro is a Python toolkit for building production-ready data science and machine learning pipelines.

+177
+1.7%
10.8K
total stars
#243
huachaohuang/awesome-dbdev

A curated list of awesome materials and resources for database development.

+177
+12.5%
1.6K
total stars
#244
gunrock/gunrock

Programmable CUDA/C++ GPU Graph Analytics library for high-performance parallel graph processing.

+177
+19.9%
1.1K
total stars
#245
TurboWay/bigdata_analyse

This is a Python project for big data analysis, focusing on HQL, SQL, and data processing.

+176
+3.6%
5.0K
total stars
#246
intake/intake

Intake is a lightweight Python package for discovering, investigating, loading and distributing data.

+176
+19.7%
1.1K
total stars
#247
Tencent/wcdb

WCDB is a cross-platform database framework developed by WeChat for Android, iOS, Linux, macOS, and Windows.

+173
+1.5%
11.7K
total stars
#248
PRQL/prql

PRQL is a modern, powerful, and pipelined SQL replacement for transforming data.

+173
+1.6%
10.7K
total stars
#249
mattn/go-sqlite3

A lightweight SQLite3 driver for Go that implements the database/sql interface.

+173
+2.0%
9.0K
total stars
#250
OSGeo/gdal

GDAL is an open-source library for working with various geospatial data formats, useful for remote sensing and GIS applications.

+173
+3.1%
5.8K
total stars