Showing 501-550 of 897 trending projects
A searchable compilation of Kaggle past solutions for data science and machine learning developers.
A Python library for performing multivariate exploratory data analysis, including techniques like PCA, CA, MCA, MFA, and FAMD.
A tool for comparing and evaluating databases for time series data.
Python interface for the igraph library, a powerful tool for network analysis and visualization.
A Python tool that generates Entity Relationship Diagrams (ERDs) from SQLAlchemy models.
A comprehensive collection of notes and resources for understanding different database technologies and concepts.
A collection of PySpark examples covering RDD, DataFrame, and Dataset operations in Python.
A fast, efficient C extension for NumPy that provides optimized array functions.
gget is a Python library that enables efficient querying of genomic reference databases like NCBI, Ensembl, and UniProt.
A high-quality dataset repository for building recommender systems and other AI-powered applications.
A curated list of resources for Polars, an open-source, high-performance data manipulation library for Python and Rust.
A personal data aggregator and analysis tool for self-tracking and quantified self enthusiasts.
Open source research data repository software built with Java.
A comprehensive Go library for working with Cassandra/Scylla databases, providing a query builder, ORM, and migration tool.
MySQL Connector/J is a JDBC driver that enables Java applications to connect to MySQL databases.
A multi-page Streamlit app for geospatial data visualization and analysis, useful for housing and real estate applications.
Modin scales pandas workflows with a single-line code change, enabling distributed data processing.
SQLBoiler is a Go ORM that generates code tailored to your database schema, making it easy to interact with databases.
A comprehensive search tool for finding Chinese NLP datasets, with support for common English NLP datasets as well.
A Ruby library that makes it easy to group temporal (time-series) data.
A JavaScript library for visualizing and understanding complex data structures.
Fluent Migrator is a .NET migration framework for managing database schema changes across multiple database providers.
A collaborative, offline-first SQLite wrapper for syncing app state across users and devices.
A fast numerical array expression evaluator for Python, NumPy, Pandas, PyTables and more.
Open-source BI platform for engineers to explore and model large-scale data pipelines.
Converts MySQL database dumps to SQLite3 compatible formats for easier migration and data portability.
MongoShake is a universal data replication platform based on MongoDB's oplog, enabling redundant replication and active-active replication.
A dataset of over 850,000 Chinese poems, from ancient to modern times, for developers working with Chinese poetry.
A comprehensive guide to feature engineering and feature selection techniques in Python, with examples.
A C++ library for importing OpenStreetMap data into a PostgreSQL/PostGIS database.
An open-source, TypeScript-based Entity-Relationship Diagram (ERD) editor for developers working with databases.
A pure Go library for reading and writing Parquet files, a columnar data format.
A Python library that syncs data from Postgres to Elasticsearch/OpenSearch, enabling real-time data pipelines.
LuxCore is a high-performance path-tracing render engine for realistic 3D graphics and visualization.
A Rust data structure for efficiently storing and accessing data in a sparse set.
Python library for clustering categorical data using k-modes and k-prototypes algorithms.
A comprehensive guide to technical references for data careers, including Python, machine learning, and data science.
Fiona is a Python library for reading and writing geographic data files, with support for CLI usage.
A PostgreSQL extension that adds HyperLogLog data structures as a native data type.
A port of Great Expectations to dbt test macros for data testing and validation in data engineering workflows.
A Python package for processing earth-observing satellite data with support for common data formats and tools.
A collection of data science, machine learning, and web development project code for Dataquest's YouTube channel.
A Python library for arbitrary-precision floating-point arithmetic, providing advanced numerical capabilities.
Scripts to download genomes from the NCBI FTP servers for bioinformatics and genomics research.
A simple SQLite file viewer that allows you to view and explore SQLite databases online.
A Kotlin library for structured data processing, suitable for data analysis and data science tasks.
A fast and flexible R package for reading flat files (CSV, TSV, fixed-width) into R data frames.
An open-source C++ framework for fast and parallel map matching of GPS trajectories.
A comprehensive English word database with translations, parts of speech, and definitions for developers.
A component for incremental subscription to and consumption of the MySQL binlog.