Trending Projects

Discover the fastest-growing open source projects

Showing 151-200 of 897 trending projects

#151

The Feldera Incremental Computation Engine is a Rust-based library for building real-time data pipelines and materialized views.

+216

+13.5%

1.8K

total stars

Rust

#152

debezium/debezium

An open-source framework for change data capture from various databases using Apache Kafka.

+215

+1.8%

12.5K

total stars

Java

#153

NateScarlet/holiday-cn

A Python tool for automatically scraping data on China's statutory holidays from government announcements.

+215

+13.2%

1.8K

total stars

Python

#154

networkx/networkx

NetworkX is a Python library for creating, manipulating, and studying the structure and dynamics of complex networks.

+211

+1.3%

16.7K

total stars

Python

#155

redis-windows/redis-windows

Windows builds of Redis (6.0.20 through 8.0.0), the popular open-source in-memory data structure store.

+210

+6.3%

3.5K

total stars

Batchfile

#156

alexeygrigorev/data-science-interviews

A repository of data science interview questions and answers for developers.

+208

+2.2%

9.8K

total stars

HTML

#157

data-engineering-community/data-engineering-wiki

A community-driven wiki for learning data engineering, covering topics like data modeling, pipelines, and databases.

+207

+12.2%

1.9K

total stars

CSS

#158

1eez/103976

A comprehensive English word database with translations, parts of speech, and definitions for developers.

+206

+25.5%

1.0K

total stars

PLpgSQL

#159

dgraph-io/dgraph

High-performance distributed graph database for real-time use cases

+205

+1.0%

21.6K

total stars

#160

deanmalmgren/textract

A Python library that provides a simple and unified interface for extracting text from any document format.

+204

+4.8%

4.5K

total stars

HTML

#161

duckdb/duckdb-wasm

WebAssembly version of the DuckDB analytical database, enabling fast in-browser analytics and SQL queries.

+202

+11.7%

1.9K

total stars

C++

#162

dhamaniasad/awesome-postgres

A curated list of awesome PostgreSQL software, libraries, tools and resources.

+201

+1.7%

11.7K

total stars

#163

igorbarinov/awesome-data-engineering

A curated list of data engineering tools for software developers.

+199

+2.4%

8.3K

total stars

#164

dr5hn/countries-states-cities-database

A comprehensive database of countries, states, and cities with data in multiple formats

+198

+2.2%

9.3K

total stars

Python

#165

probberechts/soccerdata

A Python library for scraping soccer data from various sources for sports analytics and data science.

+197

+14.1%

1.6K

total stars

Python

#166

eduosi/district

This repository contains data on Chinese administrative divisions, including names, pinyin, and codes.

+197

+22.5%

1.1K

total stars

#167

datahub-project/datahub

An open-source metadata platform for managing your data and AI stack across the enterprise.

+196

+1.7%

11.6K

total stars

Java

#168

ngaut/builddatabase

A distributed SQL database built from scratch.

+196

+10.0%

2.1K

total stars

#169

questdb/questdb

QuestDB is a high-performance, open-source, time-series database for real-time analytics and financial applications.

+194

+1.2%

16.7K

total stars

Java

#170

FavioVazquez/ds-cheatsheets

A comprehensive collection of data science cheatsheets for developers and data scientists.

+194

+1.2%

16.2K

total stars

#171

risinglightdb/risinglight

An educational OLAP database system built in Rust for learning and experimentation.

+193

+11.9%

1.8K

total stars

Rust

#172

apache/bookkeeper

Apache BookKeeper is a scalable, fault tolerant and low latency storage service optimized for append-only workloads.

+192

+10.7%

2.0K

total stars

Java

#173

google/draco

Draco is a C++ library for compressing and decompressing 3D geometric meshes and point clouds.

+191

+2.7%

7.2K

total stars

C++

#174

typeorm/typeorm

ORM for TypeScript and JavaScript with support for multiple databases and platforms.

+189

+0.5%

36.4K

total stars

TypeScript

#175

FeatureBaseDB/featurebase

FeatureBase is a fast analytical database built on bitmaps, perfect for ML and data-intensive applications.

+189

+8.1%

2.5K

total stars

#176

nutsdb/nutsdb

A simple, fast, and embeddable key-value store written in Go that supports transactions and data structures.

+188

+5.6%

3.6K

total stars

#177

youssefHosni/Data-Science-Interview-Questions-Answers

A curated list of data science interview questions and answers for developers.

+187

+3.5%

5.5K

total stars

#178

Giorgi/EntityFramework.Exceptions

A .NET Standard library that provides strongly typed exceptions for Entity Framework Core across multiple database providers.

+186

+12.3%

1.7K

total stars

#179

koaning/drawdata

A Python library that allows developers to easily draw datasets within their notebooks.

+186

+12.8%

1.6K

total stars

JavaScript

#180

argoproj/argo-workflows

Argo Workflows is a powerful open-source workflow engine for Kubernetes, enabling complex data processing and machine learning pipelines.

+185

+1.1%

16.5K

total stars

#181

tonbo-io/tonbo

Tonbo is an embedded database for serverless and edge runtimes, optimized for offline-first and big data use cases.

+185

+14.1%

1.5K

total stars

Rust

#182

yhat/db.py

db.py is a Python library that provides an easier way to interact with your databases.

+184

+17.9%

1.2K

total stars

Python

#183

openmaptiles/openmaptiles

OpenMapTiles is an open-source vector tile schema implementation for creating custom map tiles.

+183

+6.5%

3.0K

total stars

PLpgSQL

#184

paulyoder/LinqToExcel

A library that allows developers to use LINQ to retrieve data from spreadsheets and CSV files.

+183

+20.8%

1.1K

total stars

#185

TA-Lib/ta-lib-python

Python wrapper for the TA-Lib technical analysis library, useful for financial pattern recognition.

+182

+1.6%

11.8K

total stars

Cython

#186

fugue-project/fugue

A unified interface for distributed computing on Spark, Dask and Ray without any rewrites.

+182

+9.3%

2.1K

total stars

Python

#187

dataprofessor/code

A compilation of R and Python code for data science and machine learning projects.

+181

+21.3%

1.0K

total stars

Jupyter Notebook

#188

scipy/scipy

SciPy is a Python library for scientific and technical computing, providing a wide range of algorithms and tools.

+180

+1.3%

14.5K

total stars

Python

#189

nalepae/pandarallel

A parallel processing library for Pandas that improves performance on multi-core CPUs.

+180

+5.0%

3.8K

total stars

Python

#190

redis/go-redis

Redis client for Go with support for Redis 8.0+

+179

+0.8%

22.0K

total stars

#191

kuzudb/kuzu

Fast, embedded graph database with vector search and full-text search, compatible with Cypher queries.

+179

+5.0%

3.7K

total stars

C++

#192

man-group/arctic

A high-performance datastore for time series and tick data built on top of MongoDB.

+179

+6.2%

3.1K

total stars

Python

#193

skfolio/skfolio

A Python library for portfolio optimization using scikit-learn and convex optimization techniques.

+179

+10.5%

1.9K

total stars

Python

#194

zalando/spilo

Highly available PostgreSQL cluster using Docker, focused on data infrastructure for developers.

+179

+11.1%

1.8K

total stars

Python

#195

colour-science/colour

A comprehensive Python library for color science and color space conversions.

+178

+7.6%

2.5K

total stars

Python

#196

BlankerL/DXY-COVID-19-Data

A data warehouse for COVID-19 time series data, useful for data analysis and visualization.

+178

+8.9%

2.2K

total stars

Python

#197

apache/flink

Apache Flink is a stream processing framework for real-time and batch data processing.

+177

+0.7%

25.8K

total stars

Java

#198

materialsproject/pymatgen

A robust Python library for materials analysis and computational materials science.

+176

+10.7%

1.8K

total stars

Python

#199

dgraph-io/badger

Fast, embeddable key-value database written in Go for building high-performance storage applications.

+175

+1.1%

15.5K

total stars

#200

scylladb/scylladb

A high-performance NoSQL data store compatible with Apache Cassandra and Amazon DynamoDB.

+175

+1.1%

15.4K

total stars

C++
