Trending Projects

Discover the fastest-growing open source projects

Showing 501-550 of 897 trending projects

#501
EliotAndres/kaggle-past-solutions

A searchable compilation of Kaggle past solutions for data science and machine learning developers.

+6
+0.4%
1.5K
total stars
#502
MaxHalford/prince

A Python library for performing multivariate exploratory data analysis, including techniques like PCA, CA, MCA, MFA, and FAMD.

+6
+0.4%
1.4K
total stars
#503
timescale/tsbs

A tool for comparing and evaluating databases for time series data.

+6
+0.4%
1.4K
total stars
#504
igraph/python-igraph

Python interface for the igraph library, a powerful tool for network analysis and visualization.

+6
+0.4%
1.4K
total stars
#505
eralchemy/eralchemy

A Python tool that generates Entity Relationship Diagrams (ERDs) from SQLAlchemy models.

+6
+0.4%
1.4K
total stars
#506
wx-chevalier/Database-Notes

A comprehensive collection of notes and resources for understanding different database technologies and concepts.

+6
+0.4%
1.4K
total stars
#507
spark-examples/pyspark-examples

A collection of PySpark examples covering RDD, DataFrame, and Dataset operations in Python.

+6
+0.5%
1.3K
total stars
#508
pydata/bottleneck

A fast, efficient C extension for NumPy that provides optimized array functions.

+6
+0.5%
1.2K
total stars
#509
pachterlab/gget

gget is a Python library that enables efficient querying of genomic reference databases like NCBI, Ensembl, and UniProt.

+6
+0.6%
1.1K
total stars
#510
caserec/Datasets-for-Recommender-Systems

A repository of high-quality public datasets for building and evaluating recommender systems.

+6
+0.6%
1.1K
total stars
#511
ddotta/awesome-polars

A curated list of resources for Polars, an open-source, high-performance data manipulation library for Python and Rust.

+6
+0.6%
1.1K
total stars
#512
markwk/qs_ledger

A personal data aggregator and analysis tool for self-tracking and quantified self enthusiasts.

+6
+0.6%
1.1K
total stars
#513
IQSS/dataverse

Open source research data repository software built with Java.

+6
+0.6%
1.0K
total stars
#514
scylladb/gocqlx

A comprehensive Go library for working with Cassandra/Scylla databases, providing a query builder, ORM, and migration tool.

+6
+0.6%
1.0K
total stars
#515
mysql/mysql-connector-j

MySQL Connector/J is a JDBC driver that enables Java applications to connect to MySQL databases.

+6
+0.6%
1.0K
total stars
#516
opengeos/streamlit-geospatial

A multi-page Streamlit app for geospatial data visualization and analysis, useful for housing and real estate applications.

+6
+0.6%
1.0K
total stars
#517
modin-project/modin

Modin: Scalable Pandas workflows with a single line of code change, enabling distributed data processing.

+5
+0.1%
10.4K
total stars
#518
aarondl/sqlboiler

SQLBoiler is a Go ORM that generates code tailored to your database schema, making it easy to interact with databases.

+5
+0.1%
7.0K
total stars
#519
CLUEbenchmark/CLUEDatasetSearch

A comprehensive search tool for finding Chinese NLP datasets, with support for common English NLP datasets as well.

+5
+0.1%
4.4K
total stars
#520
ankane/groupdate

A Ruby library that makes it easy to group temporal data, useful for developers working with time-series data.

+5
+0.1%
3.9K
total stars
#521
benbjohnson/thesecretlivesofdata

An interactive, JavaScript-based set of visual explanations of data concepts, such as the Raft consensus algorithm.

+5
+0.1%
3.6K
total stars
#522
fluentmigrator/fluentmigrator

Fluent Migrator is a .NET migration framework for managing database schema changes across multiple database providers.

+5
+0.1%
3.5K
total stars
#523
orbitinghail/sqlsync

Collaborative offline-first SQLite wrapper for syncing app state across users & devices

+5
+0.2%
2.9K
total stars
#524
pydata/numexpr

A fast numerical array expression evaluator for Python, NumPy, Pandas, PyTables and more.

+5
+0.2%
2.4K
total stars
#525
quarylabs/quary

Open-source BI platform for engineers to explore and model large-scale data pipelines.

+5
+0.2%
2.4K
total stars
#526
mysql2sqlite/mysql2sqlite

Converts MySQL database dumps to SQLite3 compatible formats for easier migration and data portability.

+5
+0.3%
2.0K
total stars
#527
alibaba/MongoShake

MongoShake is a universal data replication platform based on MongoDB's oplog, enabling redundant replication and active-active replication.

+5
+0.3%
1.8K
total stars
#528
Werneror/Poetry

A dataset of over 850,000 Chinese poems from ancient to modern times.

+5
+0.3%
1.7K
total stars
#529
Yimeng-Zhang/feature-engineering-and-feature-selection

A comprehensive guide to feature engineering and feature selection techniques in Python, with examples.

+5
+0.3%
1.6K
total stars
#530
osm2pgsql-dev/osm2pgsql

A command-line tool written in C++ for importing OpenStreetMap data into a PostgreSQL/PostGIS database.

+5
+0.3%
1.6K
total stars
#531
dineug/erd-editor

An open-source, TypeScript-based Entity-Relationship Diagram (ERD) editor for developers working with databases.

+5
+0.3%
1.6K
total stars
#532
xitongsys/parquet-go

A pure Go library for reading and writing Parquet files, a columnar data format.

+5
+0.3%
1.4K
total stars
#533
toluaina/pgsync

A Python library that syncs data from Postgres to Elasticsearch/OpenSearch, enabling real-time data pipelines.

+5
+0.4%
1.4K
total stars
#534
LuxCoreRender/LuxCore

LuxCore is a high-performance path-tracing render engine for realistic 3D graphics and visualization.

+5
+0.4%
1.3K
total stars
#535
orlp/slotmap

A Rust container data structure that stores values behind persistent, unique keys for efficient insertion, removal, and lookup.

+5
+0.4%
1.3K
total stars
#536
nicodv/kmodes

Python library for clustering categorical data using k-modes and k-prototypes algorithms.

+5
+0.4%
1.3K
total stars
#537
TeoMeWhy/teomerefs

A collection of technical references for data careers, covering Python, machine learning, and data science.

+5
+0.4%
1.3K
total stars
#538
Toblerity/Fiona

Fiona is a Python library for reading and writing geographic data files, with support for CLI usage.

+5
+0.4%
1.2K
total stars
#539
citusdata/postgresql-hll

A PostgreSQL extension that adds HyperLogLog data structures as a native data type.

+5
+0.4%
1.2K
total stars
#540
calogica/dbt-expectations

A port of Great Expectations to dbt test macros for data testing and validation in data engineering workflows.

+5
+0.4%
1.2K
total stars
#541
pytroll/satpy

A Python package for processing earth-observing satellite data with support for common data formats and tools.

+5
+0.4%
1.2K
total stars
#542
dataquestio/project-walkthroughs

A collection of data science, machine learning, and web development project code for Dataquest's YouTube channel.

+5
+0.5%
1.1K
total stars
#543
mpmath/mpmath

A Python library for arbitrary-precision floating-point arithmetic, providing advanced numerical capabilities.

+5
+0.5%
1.1K
total stars
#544
kblin/ncbi-genome-download

Scripts to download genomes from the NCBI FTP servers for bioinformatics and genomics research.

+5
+0.5%
1.1K
total stars
#545
inloop/sqlite-viewer

A simple SQLite file viewer that allows you to view and explore SQLite databases online.

+5
+0.5%
1.0K
total stars
#546
Kotlin/dataframe

A Kotlin library for structured data processing, suitable for data analysis and data science tasks.

+5
+0.5%
1.0K
total stars
#547
tidyverse/readr

A fast and flexible R package for reading flat files (CSV, TSV, fixed-width) into R data frames.

+5
+0.5%
1.0K
total stars
#548
cyang-kth/fmm

An open-source C++ framework for fast and parallel map matching of GPS trajectories.

+5
+0.5%
1.0K
total stars
#549
1eez/103976

A comprehensive English word database with translations, parts of speech, and definitions for developers.

+5
+0.5%
1.0K
total stars
#550
alibaba/canal

MySQL binlog incremental subscription and consumption component

+4
+0.0%
29.6K
total stars