Trending Projects

Discover the fastest-growing open source projects

Showing 501-550 of 897 trending projects

#501
mwaskom/seaborn-data

This is a data repository for the Seaborn data visualization library in Python.

+132
+7.8%
1.8K
total stars
#502
mozilla/mentat

A persistent, relational store inspired by Datomic and DataScript, written in Rust.

+132
+8.7%
1.7K
total stars
#503
google/tensorstore

A C++ library for reading and writing large multi-dimensional arrays, useful for scientific and data-intensive applications.

+132
+9.7%
1.5K
total stars
#504
tonsky/datascript

Immutable database and Datalog query engine for Clojure, ClojureScript and JS

+130
+2.3%
5.7K
total stars
#505
cloudera/hue

Open source SQL query assistant service for databases and data warehouses

+130
+9.9%
1.4K
total stars
#506
lijin-THU/notes-python

A comprehensive set of Python notes and resources for developers, covering a wide range of topics including data science, machine learning, and scientific computing.

+129
+1.9%
7.1K
total stars
#507
reata/sqllineage

SQL Lineage Analysis Tool that provides data discovery and governance insights through Python.

+129
+8.6%
1.6K
total stars
#508
hermitdave/FrequencyWords

A frequency word list generator and processed files for text analysis and natural language processing.

+129
+9.7%
1.5K
total stars
#509
crate/crate

CrateDB is a distributed, scalable SQL database for storing and analyzing massive amounts of data in near real-time.

+128
+3.0%
4.4K
total stars
#510
samapriya/awesome-gee-community-datasets

A community-driven catalog of geospatial datasets for use with Google Earth Engine.

+127
+13.1%
1.1K
total stars
#511
konradhalas/dacite

A simple Python library for creating dataclasses from dictionaries.

+126
+6.7%
2.0K
total stars
#512
oetiker/rrdtool-1.x

RRDtool is a time-series database system for efficiently storing and graphing data.

+126
+13.2%
1.1K
total stars
#513
big-data-europe/docker-hive

This is a Docker container for running Apache Hive, a data warehousing tool for big data analysis.

+126
+13.2%
1.1K
total stars
#514
sfikas/medical-imaging-datasets

A collection of medical imaging datasets for researchers and developers in the healthcare industry.

+125
+5.2%
2.5K
total stars
#515
gaarason/database-all

Eloquent ORM for Java 8, 11, 17, 21, 23 and Spring Boot 2.x, 3.x

+124
+12.9%
1.1K
total stars
#516
vaexio/vaex

A high-performance Python library for working with large tabular datasets, offering efficient data manipulation and visualization.

+123
+1.5%
8.5K
total stars
#517
apache/zeppelin

Zeppelin is a web-based notebook that enables data-driven, interactive data analytics and collaborative documents.

+123
+1.9%
6.6K
total stars
#518
datavane/tis

A Java-based framework for building agile DataOps pipelines using tools like Flink, DataX, and ChunJun with a web UI.

+123
+10.6%
1.3K
total stars
#519
jackzhenguo/python-small-examples

A collection of Python code examples and tutorials for data science, machine learning, and web development.

+122
+1.5%
8.1K
total stars
#520
vaastav/Fantasy-Premier-League

A Python script that generates a CSV file with data about players in Fantasy Premier League.

+122
+7.8%
1.7K
total stars
#521
XD-DENG/SQL-exercise

A collection of SQL practice problems for developers to improve their SQL skills.

+122
+9.0%
1.5K
total stars
#522
bashtage/linearmodels

This Python library provides additional linear models for statistical modeling and analysis.

+122
+13.4%
1.0K
total stars
#523
openaddresses/openaddresses

An open-source global repository of address, building, and parcel data for developers and geospatial applications.

+121
+4.0%
3.1K
total stars
#524
apache/incubator-xtable

Apache XTable is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

+120
+11.5%
1.2K
total stars
#525
zinggAI/zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

+120
+11.6%
1.2K
total stars
#526
indradb/indradb

A Rust-based graph database for developers who need to store and query connected data.

+119
+5.2%
2.4K
total stars
#527
openspout/openspout

A fast and scalable library for reading and writing spreadsheet files (CSV, XLSX, ODS) in PHP.

+119
+12.2%
1.1K
total stars
#528
cayleygraph/cayley

An open-source graph database written in Go, useful for building applications that require linked data and graph-based queries.

+118
+0.8%
15.0K
total stars
#529
Giorgi/EntityFramework.Exceptions

A .NET Standard library that provides strongly typed exceptions for Entity Framework Core across multiple database providers.

+118
+7.4%
1.7K
total stars
#530
nakabonne/tstorage

An embedded time-series database written in Go for storing and querying metrics data.

+118
+10.6%
1.2K
total stars
#531
Werneror/Poetry

A dataset of over 850,000 Chinese poems, from ancient to modern times.

+117
+7.3%
1.7K
total stars
#532
hi-primus/optimus

Agile data preparation workflows made easy with popular Python data science libraries.

+117
+8.2%
1.5K
total stars
#533
lmmentel/awesome-python-chemistry

A curated list of Python packages for chemistry, including computational chemistry, molecular dynamics, and quantum chemistry.

+117
+9.4%
1.4K
total stars
#534
infostreams/db

A command-line tool for version controlling database snapshots, allowing developers to save, restore, and archive database state.

+117
+9.9%
1.3K
total stars
#535
orientechnologies/orientdb

OrientDB is a versatile, multi-model DBMS that supports Graph, Document, Reactive, Full-Text, and Geospatial models.

+116
+2.4%
4.9K
total stars
#536
tirthajyoti/Data-science-best-resources

A curated collection of resources for data science and machine learning enthusiasts.

+116
+3.8%
3.2K
total stars
#537
pydata/pandas-datareader

A Python library for extracting data from a wide range of internet sources into a pandas DataFrame.

+116
+3.8%
3.2K
total stars
#538
marcboeker/gmail-to-sqlite

Index your Gmail account to a SQLite DB and perform custom data analysis on your email.

+116
+10.5%
1.2K
total stars
#539
CSSEGISandData/COVID-19

COVID-19 data repository maintained by the Johns Hopkins University CSSE, with global and U.S. data for developers and researchers.

+115
+0.4%
29.0K
total stars
#540
RoaringBitmap/CRoaring

Optimized Roaring bitmaps in C and C++ with SIMD (AVX2, AVX-512, NEON) for high-performance data processing.

+115
+6.9%
1.8K
total stars
#541
robjhyndman/forecast

A time series forecasting library for R, providing a wide range of models and tools for accurate predictions.

+114
+10.9%
1.2K
total stars
#542
zalando/spilo

Highly available PostgreSQL cluster using Docker, focused on data infrastructure for developers.

+113
+6.7%
1.8K
total stars
#543
PDAL/PDAL

PDAL is a C++ library for processing point cloud data, similar to GDAL for raster data.

+113
+9.2%
1.3K
total stars
#544
PoloDB/PoloDB

PoloDB is an embedded document database written in Rust for building cross-platform, local-first applications.

+113
+10.6%
1.2K
total stars
#545
Wisser/Jailer

A Java-based database subsetting and relational data browsing tool for popular databases.

+112
+3.7%
3.1K
total stars
#546
toluaina/pgsync

A Python library that syncs data from Postgres to Elasticsearch/OpenSearch, enabling real-time data pipelines.

+112
+8.8%
1.4K
total stars
#547
rob-med/awesome-TS-anomaly-detection

A curated list of tools and datasets for anomaly detection on time-series data.

+111
+3.6%
3.2K
total stars
#548
fluid-cloudnative/fluid

Fluid is a distributed data abstraction and acceleration framework for Big Data and AI applications on the cloud.

+111
+6.2%
1.9K
total stars
#549
awslabs/open-data-registry

A registry of publicly available datasets hosted on AWS for data-driven developers.

+111
+7.2%
1.7K
total stars
#550
Awesome-Image-Registration-Organization/awesome-image-registration

A curated collection of resources related to image registration, including books, papers, videos, and toolboxes.

+111
+8.0%
1.5K
total stars