Trending Projects

Discover the fastest-growing open source projects

Showing 351-400 of 897 trending projects

#351
olric-data/olric

Olric is a distributed, in-memory key/value store and cache for Go applications and services.

+13
+0.4%
3.4K
total stars
#352
CamDavidsonPilon/lifelines

A Python library for survival analysis, useful for developers working with time-to-event data.

+13
+0.5%
2.6K
total stars
#353
san089/goodreads_etl_pipeline

An end-to-end data pipeline for building a data lake, data warehouse, and analytics platform from GoodReads data.

+13
+0.9%
1.5K
total stars
#354
quantopian/empyrical

A Python library that provides common financial risk and performance metrics used in financial analysis.

+13
+0.9%
1.5K
total stars
#355
DrTimothyAldenDavis/SuiteSparse

A powerful suite of sparse matrix algorithms and libraries for scientific and numerical computing.

+13
+0.9%
1.5K
total stars
#356
LongOnly/Quantitative-Notebooks

Educational notebooks on quantitative finance, algorithmic trading, financial modeling, and investment strategy.

+13
+1.0%
1.3K
total stars
#357
submato/xhscrawl

A web scraping tool for collecting data from Xiaohongshu, Bilibili, and other Chinese social platforms.

+13
+1.1%
1.3K
total stars
#358
opengeos/Awesome-GEE

A curated list of Google Earth Engine resources for geospatial analysis and remote sensing applications.

+13
+1.1%
1.2K
total stars
#359
RUCAIBox/RecSysDatasets

A repository of public data sources for building and testing recommender systems.

+13
+1.1%
1.2K
total stars
#360
zinggAI/zingg

Scalable identity resolution, entity resolution, data mastering, and deduplication using ML.

+13
+1.1%
1.2K
total stars
#361
moshi4/pyCirclize

A Python library for creating circular data visualizations like Circos plots, chord diagrams, and radar charts.

+13
+1.3%
1.1K
total stars
#362
google/cluster-data

A dataset of Borg cluster traces from Google, useful for researchers and developers working on distributed systems and cloud infrastructure.

+13
+1.3%
1.0K
total stars
#363
OHDSI/CommonDataModel

A definition and DDLs for the OMOP Common Data Model (CDM), a data model for healthcare data.

+13
+1.3%
1.0K
total stars
#364
allenai/s2orc

A large-scale open-access corpus of scientific papers and metadata for researchers and developers.

+13
+1.3%
1.0K
total stars
#365
typicode/lowdb

Lightweight local JSON database for JavaScript/TypeScript apps

+12
+0.1%
22.5K
total stars
#366
heibaiying/BigData-Notes

A comprehensive guide to big data technologies like Hadoop, Spark, Kafka, and more for developers.

+12
+0.1%
16.9K
total stars
#367
jupyter/docker-stacks

Docker images containing Jupyter applications for data science and machine learning workflows.

+12
+0.1%
8.4K
total stars
#368
qinwf/awesome-R

A curated list of awesome R packages, frameworks and software for data analysis and data science.

+12
+0.2%
6.4K
total stars
#369
dunwu/db-tutorial

An in-depth tutorial covering mainstream database knowledge for backend developers.

+12
+0.2%
5.3K
total stars
#370
alandefreitas/matplotplusplus

Matplot++: A C++ graphics library for creating high-quality data visualizations and scientific plots.

+12
+0.3%
4.8K
total stars
#371
GoogleTrends/data

An open-source index of Google Trends data, useful for developers building data-driven applications.

+12
+0.3%
4.8K
total stars
#372
jitsucom/jitsu

Open-source data pipeline engine for real-time ETL, connecting data sources to warehouses like BigQuery, Snowflake, Redshift.

+12
+0.3%
4.7K
total stars
#373
first20hours/google-10000-english

This repo contains a list of the 10,000 most common English words, useful for NLP and language modeling tasks.

+12
+0.3%
4.3K
total stars
#374
canonical/dqlite

An embeddable, replicated, and fault-tolerant SQL engine for building robust and scalable applications.

+12
+0.3%
4.3K
total stars
#375
ApsaraDB/PolarDB-for-PostgreSQL

A cloud-native PostgreSQL database developed by Alibaba Cloud for high-performance, scalable data storage and management.

+12
+0.4%
3.1K
total stars
#376
dolthub/go-mysql-server

A MySQL-compatible relational database with a storage agnostic query engine, implemented in Go.

+12
+0.5%
2.6K
total stars
#377
mdeff/fma

A dataset for music analysis and research, with support for deep learning and reproducible research.

+12
+0.5%
2.6K
total stars
#378
GanjinZero/awesome_Chinese_medical_NLP

A curated collection of open-source Chinese medical NLP resources including datasets, models, and more.

+12
+0.5%
2.5K
total stars
#379
malloydata/malloy

Malloy is an open-source language for describing data relationships and transformations.

+12
+0.5%
2.4K
total stars
#380
mwaskom/seaborn-data

This is a data repository for the Seaborn data visualization library in Python.

+12
+0.7%
1.8K
total stars
#381
zalando/spilo

Highly available PostgreSQL cluster using Docker, focused on data infrastructure for developers.

+12
+0.7%
1.8K
total stars
#382
polarsignals/frostdb

A fast, embeddable column database written in Go, optimized for AI/ML workloads.

+12
+0.8%
1.5K
total stars
#383
google/tensorstore

A C++ library for reading and writing large multi-dimensional arrays, useful for scientific and data-intensive applications.

+12
+0.8%
1.5K
total stars
#384
percona/percona-toolkit

Percona Toolkit is a collection of advanced open source database tools for MySQL, MongoDB, and PostgreSQL.

+12
+0.8%
1.5K
total stars
#385
elixir-explorer/explorer

A fast and elegant data exploration library for Elixir, providing series and dataframes for data science workflows.

+12
+1.0%
1.3K
total stars
#386
jblindsay/whitebox-tools

An advanced geospatial data analysis platform for tasks like geomorphology, hydrology, and remote sensing.

+12
+1.1%
1.1K
total stars
#387
Mrkuhuo/data-warehouse-learning

Open-source data warehouse learning project with examples and code for building real-time and offline data pipelines.

+12
+1.1%
1.1K
total stars
#388
sequelize/sequelize

ORM for Node.js/TypeScript with multiple database support

+11
+0.0%
30.3K
total stars
#389
fivethirtyeight/data

A data repository for the data journalism site FiveThirtyEight, containing data and code behind their articles and graphics.

+11
+0.1%
17.3K
total stars
#390
rxin/db-readings

A collection of readings and resources related to databases.

+11
+0.1%
8.0K
total stars
#391
orientechnologies/orientdb

OrientDB is a versatile, multi-model DBMS that supports Graph, Document, Reactive, Full-Text, and Geospatial models.

+11
+0.2%
4.9K
total stars
#392
lk-geimfari/mimesis

Mimesis is a fast Python library for generating fake data in multiple languages for testing and development purposes.

+11
+0.2%
4.8K
total stars
#393
indradb/indradb

A Rust-based graph database for developers who need to store and query connected data.

+11
+0.5%
2.4K
total stars
#394
konradhalas/dacite

A simple Python library for creating dataclasses from dictionaries.

+11
+0.6%
2.0K
total stars
#395
broadinstitute/gatk

Official code repository for the Genome Analysis Toolkit (GATK), a bioinformatics library for working with next-generation DNA sequencing data.

+11
+0.6%
1.9K
total stars
#396
orium/rpds

A Rust library that provides persistent data structures for efficient and immutable data management.

+11
+0.7%
1.7K
total stars
#397
Hiflylabs/awesome-dbt

A curated list of awesome resources for the data transformation tool dbt, focused on analytics engineering.

+11
+0.7%
1.6K
total stars
#398
babyfish-ct/jimmer

An advanced ORM library for Java and Kotlin developers that provides powerful caching and data management features.

+11
+0.7%
1.6K
total stars
#399
reata/sqllineage

SQL Lineage Analysis Tool that provides data discovery and governance insights through Python.

+11
+0.7%
1.6K
total stars
#400
event-driven-io/Pongo

Pongo is a MongoDB-compatible database that runs on top of PostgreSQL, offering strong consistency benefits.

+11
+0.8%
1.4K
total stars