Showing 551-600 of 897 trending projects
A high-quality dataset repository for building recommender systems, useful for developers working on AI-powered applications.
A highly scalable, distributed, document-oriented NoSQL database with full-text search, spatial, and time-series support.
Apache Avro is a data serialization system for efficient storage and transmission of structured data.
A Python library for pulling current and historical baseball statistics, including Statcast, Baseball Reference, and FanGraphs data.
A Python library providing SQL views for Dune Analytics, a popular blockchain data analysis platform.
A powerful suite of sparse matrix algorithms and libraries for scientific and numerical computing.
A collection of portfolio projects for a data analyst.
This GitHub repository provides tutorials on effectively using the Pandas library for data analysis.
A powerful, multi-database ORM for .NET that supports a wide range of SQL databases and provides a seamless data access layer.
Deequ is a Scala library for defining "unit tests for data" to measure data quality in large datasets.
A Python library for extracting data from a wide range of internet sources into a pandas DataFrame.
A Go-based tool for database anonymization and synthetic data generation to help with security, QA, and data masking.
PumpkinDB is an immutable, ordered key-value database engine written in Rust.
A cross-platform way to express data transformation, relational algebra, and standardized record expression and plans.
An embedded time-series database written in Go for storing and querying metrics data.
A cloud-native PostgreSQL database developed by Alibaba Cloud for high-performance, scalable data storage and management.
A collection of stock analysis tools across various programming languages and platforms.
Entity Framework Core provider for PostgreSQL, enabling .NET developers to easily interact with PostgreSQL databases.
A curated list of Google Earth Engine resources for geospatial analysis and remote sensing applications.
An open-source platform for building and sharing datasets, focused on trust, privacy, and decentralization.
A Redis GUI client joining forces with Redis to enhance the developer experience.
Alluxio is an open-source data orchestration platform for analytics and machine learning workloads in the cloud.
A Rust-based graph database for developers who need to store and query connected data.
Agile data preparation workflows made easy with popular Python data science libraries.
Open Babel is a chemical toolbox for working with chemical data and cheminformatics.
A simple SQLite file viewer that allows you to view and explore SQLite databases online.
An open-source global repository of address, building, and parcel data for developers and geospatial applications.
A registry of publicly available datasets hosted on AWS for data-driven developers.
An end-to-end data pipeline for building a data lake, data warehouse, and analytics platform from GoodReads data.
A Python data analysis library optimized for humans instead of machines.
A comprehensive set of Python notes and resources for developers, covering a wide range of topics including data science, machine learning, and scientific computing.
Cloud-based database manager UI for querying, managing, and visualizing databases across multiple platforms.
A graph and network visualization library for R developers working with tabular data.
Open-source relational database engine powering web apps, APIs, and data-driven backends worldwide.
SQLBoiler is a Go ORM that generates code tailored to your database schema, making it easy to interact with databases.
Automatically visualize your pandas dataframes with a single print command, enabling quick EDA.
An open-source index of Google Trends data, useful for developers building data-driven applications.
Automatically generates beautiful and easy-to-read ER diagrams from your database.
A curated list of awesome JSON datasets that don't require authentication.
A powerful C library for analyzing complex networks and graph-based data structures.
Highly available PostgreSQL cluster using Docker, focused on data infrastructure for developers.
A Python script that generates a CSV file with data about players in the English Premier League Fantasy League.
OrientDB is a versatile, multi-model DBMS that supports Graph, Document, Reactive, Full-Text, and Geospatial models.
Fluid is a distributed data abstraction and acceleration framework for Big Data and AI applications on the cloud.
Poisson Surface Reconstruction is a C++ library for reconstructing surfaces from point cloud data.
Percona Toolkit is a collection of advanced open source database tools for MySQL, MongoDB, and PostgreSQL.
A curated list of Python packages for chemistry, including computational chemistry, molecular dynamics, and quantum chemistry.
A Python driver for the ClickHouse database with native interface support.
A DICOM to NIfTI converter for medical imaging research and neuroimaging applications.