Trending Projects

Discover the fastest-growing open source projects

Showing 451-500 of 897 trending projects

#451
rogersce/cnpy

A C++ library for reading and writing .npy and .npz files, commonly used in scientific computing.

+8
+0.6%
1.5K
total stars
#452
XTXMarkets/ternfs

An exabyte-scale, multi-region distributed file system.

+8
+0.6%
1.3K
total stars
#453
paulvangentcom/heartrate_analysis_python

A Python package for analyzing heart rate data from PPG and ECG signals.

+8
+0.7%
1.1K
total stars
#454
openspout/openspout

A fast and scalable library for reading and writing spreadsheet files (CSV, XLSX, ODS) in PHP.

+8
+0.7%
1.1K
total stars
#455
shaypal5/awesome-twitter-data

A curated list of Twitter datasets and resources for data scientists and social network analysts.

+8
+0.7%
1.1K
total stars
#456
cayleygraph/cayley

An open-source graph database written in Go, useful for building applications that require linked data and graph-based queries.

+7
+0.1%
15.0K
total stars
#457
datastacktv/data-engineer-roadmap

A roadmap for becoming a data engineer.

+7
+0.1%
12.7K
total stars
#458
dedupeio/dedupe

A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.

+7
+0.2%
4.4K
total stars
#459
upper/db

A data access layer (DAL) and ORM-like library for working with SQL and NoSQL databases in Go.

+7
+0.2%
3.6K
total stars
#460
ClickHouse/clickhouse-go

A Go driver for the ClickHouse analytics database, enabling fast and efficient data processing.

+7
+0.2%
3.3K
total stars
#461
tirthajyoti/Data-science-best-resources

A curated collection of resources for data science and machine learning enthusiasts.

+7
+0.2%
3.2K
total stars
#462
aditya-grover/node2vec

A reference implementation of the node2vec algorithm for learning node embeddings on graphs, written in Python.

+7
+0.3%
2.7K
total stars
#463
bytewax/bytewax

Bytewax is a Python library for building scalable, fault-tolerant, and low-latency data processing pipelines.

+7
+0.4%
2.0K
total stars
#464
ContextLab/hypertools

A Python toolbox for gaining geometric insights into high-dimensional data.

+7
+0.4%
1.9K
total stars
#465
risinglightdb/risinglight

An educational OLAP database system built in Rust for learning and experimentation.

+7
+0.4%
1.8K
total stars
#466
npgsql/efcore.pg

Entity Framework Core provider for PostgreSQL, enabling .NET developers to easily interact with PostgreSQL databases.

+7
+0.4%
1.8K
total stars
#467
mourner/flatbush

A fast spatial index library for 2D points and rectangles in JavaScript, useful for geospatial applications.

+7
+0.5%
1.6K
total stars
#468
paradigmxyz/cryo

cryo is a Rust library for extracting blockchain data to parquet, CSV, JSON, or Python dataframes.

+7
+0.5%
1.5K
total stars
#469
XD-DENG/SQL-exercise

A collection of SQL practice problems for developers to improve their SQL skills.

+7
+0.5%
1.5K
total stars
#470
duneanalytics/spellbook

A dbt project of SQL models for Dune Analytics, a popular blockchain data analysis platform.

+7
+0.5%
1.5K
total stars
#471
NiuTrans/Classical-Modern

A parallel corpus of classical Chinese and modern Chinese texts for language processing and analysis.

+7
+0.5%
1.4K
total stars
#472
paul-buerkner/brms

An R package for Bayesian generalized multivariate non-linear multilevel models using Stan.

+7
+0.5%
1.4K
total stars
#473
damklis/DataEngineeringProject

An end-to-end data engineering project example showcasing tools and technologies for building data pipelines.

+7
+0.5%
1.4K
total stars
#474
opendatadiscovery/odd-platform

First open-source data discovery and observability platform for data practitioners.

+7
+0.5%
1.4K
total stars
#475
databricks/LearningSparkV2

The companion code repository for the book Learning Spark (2nd Edition), which teaches how to use Apache Spark for lightning-fast data analytics.

+7
+0.5%
1.4K
total stars
#476
crazyhottommy/getting-started-with-genomics-tools-and-resources

A collection of Unix, R, and Python tools for bioinformatics and data science projects.

+7
+0.5%
1.4K
total stars
#477
avinassh/py-caskdb

An educational project to build a disk-based key-value store in Python for learning purposes.

+7
+0.5%
1.4K
total stars
#478
eleanorlutz/asteroids_atlas_of_space

This is an astronomy visualization project that maps orbits of asteroids in the solar system.

+7
+0.5%
1.3K
total stars
#479
s3ql/s3ql

A full-featured file system for online data storage, built with Python.

+7
+0.6%
1.2K
total stars
#480
andrewgbruce/statistics-for-data-scientists

This repository provides code and data for a book on statistics for data scientists.

+7
+0.6%
1.2K
total stars
#481
PoloDB/PoloDB

PoloDB is an embedded document database written in Rust for building cross-platform, local-first applications.

+7
+0.6%
1.2K
total stars
#482
apache/amoro

Apache Amoro is an open-source Lakehouse management system built on open data lake formats such as Iceberg and Hudi, working with compute engines such as Flink.

+7
+0.6%
1.1K
total stars
#483
beamandrew/medical-data

No description provided for this medical data repository.

+6
+0.1%
6.0K
total stars
#484
DotNetNext/SqlSugar

A powerful, multi-database ORM for .NET that supports a wide range of SQL databases and provides a seamless data access layer.

+6
+0.1%
5.8K
total stars
#485
sqlkata/querybuilder

SQL query builder for C# developers, supporting multiple databases and complex queries.

+6
+0.2%
3.3K
total stars
#486
caj2pdf/caj2pdf

A Python tool to convert CAJ (China Academic Journals) files to PDF for developers who work with academic literature.

+6
+0.2%
3.2K
total stars
#487
uiwjs/province-city-china

Comprehensive dataset of China's administrative divisions (province, city, county, town) in JSON, CSV, and SQL formats.

+6
+0.2%
3.0K
total stars
#488
gonum/plot

A Go library for creating high-quality plots and visualizations of data.

+6
+0.2%
2.9K
total stars
#489
sfikas/medical-imaging-datasets

A collection of medical imaging datasets for researchers and developers in the healthcare industry.

+6
+0.2%
2.5K
total stars
#490
geekyouth/SZT-bigdata

This is a big data analysis system for the Shenzhen metro with support for various data processing tools.

+6
+0.3%
2.4K
total stars
#491
benedekrozemberczki/awesome-community-detection

A curated list of community detection research papers with implementations for data science and network analysis.

+6
+0.3%
2.4K
total stars
#492
brimdata/zui

Zui is a powerful desktop app for exploring and working with data, with support for CSV, JSON, and the Zed data format.

+6
+0.3%
1.9K
total stars
#493
mirage/irmin

Irmin is a distributed database that follows the same design principles as Git, allowing for distributed version control of data.

+6
+0.3%
1.9K
total stars
#494
fluid-cloudnative/fluid

Fluid is a distributed data abstraction and acceleration framework for Big Data and AI applications on the cloud.

+6
+0.3%
1.9K
total stars
#495
raphaelvallat/pingouin

A Python statistical package based on Pandas, providing various statistical methods and tests.

+6
+0.3%
1.9K
total stars
#496
mkazhdan/PoissonRecon

Poisson Surface Reconstruction is a C++ library for reconstructing surfaces from point cloud data.

+6
+0.3%
1.8K
total stars
#497
TuGraph-family/tugraph-db

TuGraph-DB is a high-performance graph database built for fast and efficient graph data processing.

+6
+0.3%
1.7K
total stars
#498
vaastav/Fantasy-Premier-League

A Python script that generates a CSV file with data about players in Fantasy Premier League, the fantasy game of the English Premier League.

+6
+0.4%
1.7K
total stars
#499
imageio/imageio

A Python library for reading and writing a wide range of image and video formats, including DICOM, animated GIFs, and webcam capture.

+6
+0.4%
1.7K
total stars
#500
capitalone/DataProfiler

A Python library for extracting schema, statistics, and entities from datasets, useful for data profiling and privacy analysis.

+6
+0.4%
1.5K
total stars