Trending Projects

Discover the fastest growing open source projects

Showing 51-100 of 897 trending projects

#51

dgraph-io/dgraph

High-performance distributed graph database for real-time use cases

0.0%

21.6K

total stars

#52

chartdb/chartdb

Web-based database diagramming editor with AI-powered export and schema import

0.0%

21.4K

total stars

TypeScript

#53

valeriansaliou/sonic

Fast, lightweight search backend alternative to Elasticsearch

0.0%

21.2K

total stars

Rust

#54

elastic/kibana

Kibana is an open-source data visualization and management tool for Elasticsearch

0.0%

21.0K

total stars

TypeScript

#55

vitessio/vitess

Distributed MySQL database system for horizontal scaling

0.0%

20.8K

total stars

#56

airbytehq/airbyte

Data integration platform for ELT pipelines from APIs, databases & files to databases, warehouses & lakes

0.0%

20.8K

total stars

Python

#57

apache/shardingsphere

Distributed SQL database middleware for sharding, scalability, and security

0.0%

20.7K

total stars

Java

#58

dolthub/dolt

Dolt is Git for Data, enabling version control for SQL databases with Git-like commands and features.

0.0%

20.5K

total stars

#59

mybatis/mybatis-3

MyBatis SQL Mapper for Java simplifies database interactions with object mapping.

0.0%

20.4K

total stars

Java

#60

knex/knex

SQL query builder for multiple databases

0.0%

20.2K

total stars

JavaScript

#61

postgres/postgres

PostgreSQL database source code

0.0%

20.2K

total stars

#62

pgvector/pgvector

Vector similarity search for Postgres

0.0%

20.1K

total stars

#63

tursodatabase/turso

Turso is an in-process SQL database, compatible with SQLite, written in Rust for high performance.

0.0%

17.7K

total stars

Rust

#64

rqlite/rqlite

A lightweight, fault-tolerant distributed database built on SQLite, designed for high availability.

0.0%

17.3K

total stars

#65

fivethirtyeight/data

A data repository for the data journalism site FiveThirtyEight, containing data and code behind their articles and graphics.

0.0%

17.3K

total stars

Jupyter Notebook

#66

heibaiying/BigData-Notes

A comprehensive guide to big data technologies like Hadoop, Spark, Kafka, and more for developers.

0.0%

16.9K

total stars

Java

#67

akfamily/akshare

AKShare is a simple and elegant Python library for accessing financial data APIs.

0.0%

16.8K

total stars

Python

#68

questdb/questdb

QuestDB is a high-performance, open-source, time-series database for real-time analytics and financial applications.

0.0%

16.7K

total stars

Java

#69

networkx/networkx

NetworkX is a Python library for creating, manipulating, and studying the structure and dynamics of complex networks.

0.0%

16.7K

total stars

Python

#70

prestodb/presto

Presto is an open-source distributed SQL query engine for big data, allowing fast analysis of large datasets.

0.0%

16.7K

total stars

Java

#71

apache/arrow

Apache Arrow is a fast columnar data format and toolset for in-memory analytics and data interchange.

0.0%

16.6K

total stars

C++

#72

tikv/tikv

Distributed transactional key-value database, originally created to complement TiDB

0.0%

16.6K

total stars

Rust

#73

argoproj/argo-workflows

Argo Workflows is a powerful open-source workflow engine for Kubernetes, enabling complex data processing and machine learning pipelines.

0.0%

16.5K

total stars

#74

tursodatabase/libsql

libSQL is an open-source, open-contribution fork of SQLite, a widely used embedded database.

0.0%

16.4K

total stars

#75

prisma/prisma1

Prisma1 is a database toolkit with an ORM, migrations, and admin UI for Postgres, MySQL, and MongoDB.

0.0%

16.4K

total stars

Scala

#76

FavioVazquez/ds-cheatsheets

A comprehensive collection of data science cheatsheets for developers and data scientists.

0.0%

16.2K

total stars

#77

apple/foundationdb

FoundationDB is an open-source, distributed, transactional key-value store that provides ACID guarantees.

0.0%

16.2K

total stars

C++

#78

dgraph-io/badger

Fast, embeddable key-value database written in Go for building high-performance storage applications.

0.0%

15.5K

total stars

#79

treeverse/dvc

DVC is a data versioning and ML experiments tool that helps developers manage and track data and model changes.

0.0%

15.4K

total stars

Python

#80

scylladb/scylladb

A high-performance NoSQL data store compatible with Apache Cassandra and Amazon DynamoDB.

0.0%

15.4K

total stars

C++

#81

apache/doris

Apache Doris is a high-performance, unified analytics database for real-time data processing.

0.0%

15.1K

total stars

Java

#82

dagster-io/dagster

An open-source data orchestration platform for developing, running, and observing data pipelines and workflows.

0.0%

15.1K

total stars

Python

#83

zhisheng17/flink-learning

This is a comprehensive learning resource for the Flink stream processing framework, covering concepts, principles, and real-world use cases.

0.0%

15.1K

total stars

Java

#84

cayleygraph/cayley

An open-source graph database written in Go, useful for building applications that require linked data and graph-based queries.

0.0%

15.0K

total stars

#85

andkret/Cookbook

A comprehensive cookbook for data engineers, covering best practices, big data, and data engineering concepts.

0.0%

15.0K

total stars

Python

#86

waditu/tushare

A Python library for crawling historical data on Chinese stocks.

0.0%

14.5K

total stars

Python

#87

scipy/scipy

SciPy is a Python library for scientific and technical computing, providing a wide range of algorithms and tools.

0.0%

14.5K

total stars

Python

#88

oxnr/awesome-bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

0.0%

14.3K

total stars

#89

arangodb/arangodb

ArangoDB is a multi-model database supporting documents, graphs, and key-values for high-performance applications.

0.0%

14.1K

total stars

C++

#90

dexie/Dexie.js

Dexie.js is a minimalistic IndexedDB wrapper that simplifies offline storage and database management in web applications.

0.0%

14.1K

total stars

TypeScript

#91

apache/druid

Apache Druid is a high-performance real-time analytics database for data-intensive applications.

0.0%

13.9K

total stars

Java

#92

dask/dask

Dask is a Python library for parallel computing and distributed data processing, providing a scalable alternative to NumPy and Pandas.

0.0%

13.8K

total stars

Python

#93

sql-js/sql.js

A JavaScript library that allows you to run SQLite on the web, enabling local database functionality for web apps.

0.0%

13.6K

total stars

JavaScript

#94

Data-Centric-AI-Community/ydata-profiling

A Python library for fast, customizable, and interactive data profiling and exploratory data analysis.

0.0%

13.4K

total stars

Python

#95

juicedata/juicefs

JuiceFS is a distributed POSIX file system built on top of Redis and S3 for big data and cloud-native applications.

0.0%

13.3K

total stars

#96

google/or-tools

Google's Operations Research tools for combinatorial optimization and linear programming.

0.0%

13.2K

total stars

C++

#97

datastacktv/data-engineer-roadmap

A roadmap for becoming a data engineer.

0.0%

12.7K

total stars

#98

trinodb/trino

Trino is a distributed SQL query engine for big data, allowing fast, scalable, and cost-effective analytics.

0.0%

12.6K

total stars

Java

#99

debezium/debezium

An open-source framework for change data capture from various databases using Apache Kafka.

0.0%

12.5K

total stars

Java

#100

citusdata/citus

Citus is a distributed PostgreSQL database that enables scaling out your Postgres database across multiple nodes.

0.0%

12.3K

total stars

