Trending Projects

Discover the fastest growing open source projects

Showing 351-400 of 897 trending projects

#351

olric-data/olric

Olric is a distributed, in-memory key/value store and cache for Go applications and services.

+13

+0.4%

3.4K

total stars

#352

CamDavidsonPilon/lifelines

A Python library for survival analysis, useful for developers working with time-to-event data.

+13

+0.5%

2.6K

total stars

Python

#353

san089/goodreads_etl_pipeline

An end-to-end data pipeline for building a data lake, data warehouse, and analytics platform from GoodReads data.

+13

+0.9%

1.5K

total stars

Python

#354

quantopian/empyrical

A Python library that provides common financial risk and performance metrics used in financial analysis.

+13

+0.9%

1.5K

total stars

Python

#355

DrTimothyAldenDavis/SuiteSparse

A powerful suite of sparse matrix algorithms and libraries for scientific and numerical computing.

+13

+0.9%

1.5K

total stars

#356

LongOnly/Quantitative-Notebooks

Educational notebooks on quantitative finance, algorithmic trading, financial modeling, and investment strategy.

+13

+1.0%

1.3K

total stars

Jupyter Notebook

#357

submato/xhscrawl

A web scraping tool for collecting data from Xiaohongshu, Bilibili, and other Chinese social platforms.

+13

+1.1%

1.3K

total stars

#358

opengeos/Awesome-GEE

A curated list of Google Earth Engine resources for geospatial analysis and remote sensing applications.

+13

+1.1%

1.2K

total stars

#359

RUCAIBox/RecSysDatasets

A repository of public data sources for building and testing recommender systems.

+13

+1.1%

1.2K

total stars

Python

#360

zinggAI/zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

+13

+1.1%

1.2K

total stars

Java

#361

moshi4/pyCirclize

A Python library for creating circular data visualizations like Circos plots, chord diagrams, and radar charts.

+13

+1.3%

1.1K

total stars

Python

#362

google/cluster-data

A dataset of Borg cluster traces from Google, useful for researchers and developers working on distributed systems and cloud infrastructure.

+13

+1.3%

1.0K

total stars

TeX

#363

OHDSI/CommonDataModel

A definition and DDLs for the OMOP Common Data Model (CDM), a data model for healthcare data.

+13

+1.3%

1.0K

total stars

HTML

#364

allenai/s2orc

A large-scale open-access corpus of scientific papers and metadata for researchers and developers.

+13

+1.3%

1.0K

total stars

Python

#365

typicode/lowdb

Lightweight local JSON database for JavaScript/TypeScript apps

+12

+0.1%

22.5K

total stars

JavaScript

#366

heibaiying/BigData-Notes

A comprehensive guide to big data technologies like Hadoop, Spark, Kafka, and more for developers.

+12

+0.1%

16.9K

total stars

Java

#367

jupyter/docker-stacks

Docker images containing Jupyter applications for data science and machine learning workflows.

+12

+0.1%

8.4K

total stars

Python

#368

qinwf/awesome-R

A curated list of awesome R packages, frameworks and software for data analysis and data science.

+12

+0.2%

6.4K

total stars

#369

dunwu/db-tutorial

An in-depth tutorial covering mainstream database knowledge for backend developers.

+12

+0.2%

5.3K

total stars

Java

#370

alandefreitas/matplotplusplus

Matplot++: A C++ graphics library for creating high-quality data visualizations and scientific plots.

+12

+0.3%

4.8K

total stars

C++

#371

GoogleTrends/data

An open-source index of Google Trends data, useful for developers building data-driven applications.

+12

+0.3%

4.8K

total stars

JavaScript

#372

jitsucom/jitsu

Open-source data pipeline engine for real-time ETL, connecting data sources to warehouses like BigQuery, Snowflake, Redshift.

+12

+0.3%

4.7K

total stars

TypeScript

#373

first20hours/google-10000-english

A list of the 10,000 most common English words, useful for NLP and language modeling tasks.

+12

+0.3%

4.3K

total stars

#374

canonical/dqlite

An embeddable, replicated, and fault-tolerant SQL engine for building robust and scalable applications.

+12

+0.3%

4.3K

total stars

#375

ApsaraDB/PolarDB-for-PostgreSQL

A cloud-native PostgreSQL database developed by Alibaba Cloud for high-performance, scalable data storage and management.

+12

+0.4%

3.1K

total stars

#376

dolthub/go-mysql-server

A MySQL-compatible relational database with a storage agnostic query engine, implemented in Go.

+12

+0.5%

2.6K

total stars

#377

mdeff/fma

A dataset for music analysis and research, with support for deep learning and reproducible research.

+12

+0.5%

2.6K

total stars

Jupyter Notebook

#378

GanjinZero/awesome_Chinese_medical_NLP

A curated collection of open-source Chinese medical NLP resources including datasets, models, and more.

+12

+0.5%

2.5K

total stars

#379

malloydata/malloy

Malloy is an open-source language for describing data relationships and transformations.

+12

+0.5%

2.4K

total stars

TypeScript

#380

mwaskom/seaborn-data

A data repository for the seaborn data visualization library in Python.

+12

+0.7%

1.8K

total stars

Python

#381

zalando/spilo

Highly available PostgreSQL cluster using Docker, focused on data infrastructure for developers.

+12

+0.7%

1.8K

total stars

Python

#382

polarsignals/frostdb

A fast, embeddable columnar database written in Go.

+12

+0.8%

1.5K

total stars

#383

google/tensorstore

A C++ library for reading and writing large multi-dimensional arrays, useful for scientific and data-intensive applications.

+12

+0.8%

1.5K

total stars

C++

#384

percona/percona-toolkit

Percona Toolkit is a collection of advanced open source database tools for MySQL, MongoDB, and PostgreSQL.

+12

+0.8%

1.5K

total stars

Perl

#385

elixir-explorer/explorer

A fast and elegant data exploration library for Elixir, providing series and dataframes for data science workflows.

+12

+1.0%

1.3K

total stars

Elixir

#386

jblindsay/whitebox-tools

An advanced geospatial data analysis platform for tasks like geomorphology, hydrology, and remote sensing.

+12

+1.1%

1.1K

total stars

Rust

#387

Mrkuhuo/data-warehouse-learning

Open-source data warehouse learning project with examples and code for building real-time and offline data pipelines.

+12

+1.1%

1.1K

total stars

Java

#388

sequelize/sequelize

ORM for Node.js/TypeScript with multiple database support

+11

+0.0%

30.3K

total stars

TypeScript

#389

fivethirtyeight/data

A data repository for the data journalism site FiveThirtyEight, containing data and code behind their articles and graphics.

+11

+0.1%

17.3K

total stars

Jupyter Notebook

#390

rxin/db-readings

A collection of readings and resources related to databases.

+11

+0.1%

8.0K

total stars

#391

orientechnologies/orientdb

OrientDB is a versatile, multi-model DBMS that supports Graph, Document, Reactive, Full-Text, and Geospatial models.

+11

+0.2%

4.9K

total stars

Java

#392

lk-geimfari/mimesis

Mimesis is a fast Python library for generating fake data in multiple languages for testing and development purposes.

+11

+0.2%

4.8K

total stars

Python

#393

indradb/indradb

A Rust-based graph database for developers who need to store and query connected data.

+11

+0.5%

2.4K

total stars

Rust

#394

konradhalas/dacite

A simple Python library for creating dataclasses from dictionaries.

+11

+0.6%

2.0K

total stars

Python

#395

broadinstitute/gatk

Official code repository for the Genome Analysis Toolkit (GATK), a bioinformatics library for working with next-generation DNA sequencing data.

+11

+0.6%

1.9K

total stars

Java

#396

orium/rpds

A Rust library that provides persistent data structures for efficient and immutable data management.

+11

+0.7%

1.7K

total stars

Rust

#397

Hiflylabs/awesome-dbt

A curated list of awesome resources for the data transformation tool dbt, focused on analytics engineering.

+11

+0.7%

1.6K

total stars

#398

babyfish-ct/jimmer

An advanced ORM library for Java and Kotlin developers that provides powerful caching and data management features.

+11

+0.7%

1.6K

total stars

Java

#399

reata/sqllineage

SQL Lineage Analysis Tool that provides data discovery and governance insights through Python.

+11

+0.7%

1.6K

total stars

Python

#400

event-driven-io/Pongo

Pongo is a MongoDB-compatible database that runs on top of PostgreSQL, offering strong consistency benefits.

+11

+0.8%

1.4K

total stars

TypeScript
