Trending Projects

Discover the fastest growing open source projects

Showing 451-500 of 897 trending projects

#451

rogersce/cnpy

A C++ library for reading and writing .npy and .npz files, commonly used in scientific computing.

+0.6%

1.5K

total stars

C++

#452

XTXMarkets/ternfs

An exabyte-scale, multi-region distributed file system for developers building AI-powered applications.

+0.6%

1.3K

total stars

C++

#453

paulvangentcom/heartrate_analysis_python

A Python package for analyzing heart rate data from PPG and ECG signals.

+0.7%

1.1K

total stars

Python

#454

openspout/openspout

A fast and scalable library for reading and writing spreadsheet files (CSV, XLSX, ODS) in PHP.

+0.7%

1.1K

total stars

PHP

#455

shaypal5/awesome-twitter-data

A curated list of Twitter datasets and resources for data scientists and social network analysts.

+0.7%

1.1K

total stars

#456

cayleygraph/cayley

An open-source graph database written in Go, useful for building applications that require linked data and graph-based queries.

+0.1%

15.0K

total stars

#457

datastacktv/data-engineer-roadmap

A roadmap for becoming a data engineer.

+0.1%

12.7K

total stars

#458

dedupeio/dedupe

A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.

+0.2%

4.4K

total stars

Python

#459

upper/db

A data access layer (DAL) and ORM-like library for working with SQL and NoSQL databases in Go.

+0.2%

3.6K

total stars

#460

ClickHouse/clickhouse-go

A Go driver for the ClickHouse analytics database, enabling fast and efficient data processing.

+0.2%

3.3K

total stars

#461

tirthajyoti/Data-science-best-resources

A curated collection of resources for data science and machine learning enthusiasts.

+0.2%

3.2K

total stars

#462

aditya-grover/node2vec

This Scala library provides a high-performance implementation of the node2vec algorithm for embedding graphs.

+0.3%

2.7K

total stars

Scala

#463

bytewax/bytewax

Bytewax is a Python library for building scalable, fault-tolerant, and low-latency data processing pipelines.

+0.4%

2.0K

total stars

Python

#464

ContextLab/hypertools

A Python toolbox for gaining geometric insights into high-dimensional data.

+0.4%

1.9K

total stars

Python

#465

risinglightdb/risinglight

An educational OLAP database system built in Rust for learning and experimentation.

+0.4%

1.8K

total stars

Rust

#466

npgsql/efcore.pg

Entity Framework Core provider for PostgreSQL, enabling .NET developers to easily interact with PostgreSQL databases.

+0.4%

1.8K

total stars

#467

mourner/flatbush

A fast spatial index library for 2D points and rectangles in JavaScript, useful for geospatial applications.

+0.5%

1.6K

total stars

JavaScript

#468

paradigmxyz/cryo

cryo is a Rust library for extracting blockchain data to Parquet, CSV, JSON, or Python dataframes.

+0.5%

1.5K

total stars

Rust

#469

XD-DENG/SQL-exercise

A collection of SQL practice problems for developers to improve their SQL skills.

+0.5%

1.5K

total stars

#470

duneanalytics/spellbook

A Python library providing SQL views for Dune Analytics, a popular blockchain data analysis platform.

+0.5%

1.5K

total stars

Python

#471

NiuTrans/Classical-Modern

A parallel corpus of classical Chinese and modern Chinese texts for language processing and analysis.

+0.5%

1.4K

total stars

Python

#472

paul-buerkner/brms

An R package for Bayesian generalized multivariate non-linear multilevel models using Stan.

+0.5%

1.4K

total stars

#473

damklis/DataEngineeringProject

An end-to-end data engineering project example showcasing tools and technologies for building data pipelines.

+0.5%

1.4K

total stars

Python

#474

opendatadiscovery/odd-platform

First open-source data discovery and observability platform for data practitioners.

+0.5%

1.4K

total stars

Java

#475

databricks/LearningSparkV2

Companion repository for the book Learning Spark (2nd edition), which teaches how to use Apache Spark for lightning-fast data analytics.

+0.5%

1.4K

total stars

Scala

#476

crazyhottommy/getting-started-with-genomics-tools-and-resources

A collection of Unix, R, and Python tools for bioinformatics and data science projects.

+0.5%

1.4K

total stars

Shell

#477

avinassh/py-caskdb

An educational project to build a disk-based key-value store in Python for learning purposes.

+0.5%

1.4K

total stars

Python

#478

eleanorlutz/asteroids_atlas_of_space

An astronomy visualization project that maps the orbits of asteroids in the solar system.

+0.5%

1.3K

total stars

Jupyter Notebook

#479

s3ql/s3ql

A full-featured file system for online data storage, built with Python.

+0.6%

1.2K

total stars

Python

#480

andrewgbruce/statistics-for-data-scientists

This repository provides code and data for a book on statistics for data scientists.

+0.6%

1.2K

total stars

#481

PoloDB/PoloDB

PoloDB is an embedded document database written in Rust for building cross-platform, local-first applications.

+0.6%

1.2K

total stars

Rust

#482

apache/amoro

Apache Amoro is an open-source Lakehouse management system built on open data lake formats such as Hudi and Iceberg, and it works with compute engines such as Flink.

+0.6%

1.1K

total stars

Java

#483

beamandrew/medical-data

No description provided for this medical data repository.

+0.1%

6.0K

total stars

#484

DotNetNext/SqlSugar

A powerful, multi-database ORM for .NET that supports a wide range of SQL databases and provides a seamless data access layer.

+0.1%

5.8K

total stars

#485

sqlkata/querybuilder

SQL query builder for C# developers, supporting multiple databases and complex queries.

+0.2%

3.3K

total stars

#486

caj2pdf/caj2pdf

A Python tool to convert CAJ (China Academic Journals) files to PDF for developers who work with academic literature.

+0.2%

3.2K

total stars

Python

#487

uiwjs/province-city-china

Comprehensive dataset of China's administrative divisions (province, city, county, town) in JSON, CSV, and SQL formats.

+0.2%

3.0K

total stars

JavaScript

#488

gonum/plot

A Go library for creating high-quality plots and visualizations of data.

+0.2%

2.9K

total stars

#489

sfikas/medical-imaging-datasets

A collection of medical imaging datasets for researchers and developers in the healthcare industry.

+0.2%

2.5K

total stars

#490

geekyouth/SZT-bigdata

A big data analysis system for the Shenzhen metro, with support for various data processing tools.

+0.3%

2.4K

total stars

Scala

#491

benedekrozemberczki/awesome-community-detection

A curated list of community detection research papers with implementations for data science and network analysis.

+0.3%

2.4K

total stars

Python

#492

brimdata/zui

Zui is a powerful desktop app for exploring and working with data, with support for CSV, JSON, and the Zed data format.

+0.3%

1.9K

total stars

TypeScript

#493

mirage/irmin

Irmin is a distributed database that follows the same design principles as Git, allowing for distributed version control of data.

+0.3%

1.9K

total stars

OCaml

#494

fluid-cloudnative/fluid

Fluid is a distributed data abstraction and acceleration framework for Big Data and AI applications on the cloud.

+0.3%

1.9K

total stars

#495

raphaelvallat/pingouin

A Python statistical package based on Pandas, providing various statistical methods and tests.

+0.3%

1.9K

total stars

Python

#496

mkazhdan/PoissonRecon

Poisson Surface Reconstruction is a C++ library for reconstructing surfaces from point cloud data.

+0.3%

1.8K

total stars

C++

#497

TuGraph-family/tugraph-db

TuGraph-DB is a high-performance graph database built for fast and efficient graph data processing.

+0.3%

1.7K

total stars

C++

#498

vaastav/Fantasy-Premier-League

A Python script that generates a CSV file with player data from Fantasy Premier League, the fantasy game of the English Premier League.

+0.4%

1.7K

total stars

Python

#499

imageio/imageio

A Python library for reading and writing a wide range of image and video formats, including DICOM, animated GIFs, and webcam capture.

+0.4%

1.7K

total stars

Python

#500

capitalone/DataProfiler

A Python library for extracting schema, statistics, and entities from datasets, useful for data profiling and privacy analysis.

+0.4%

1.5K

total stars

Python
