Showing 851-897 of 897 trending projects
A lightweight key-value store built with C++ using a skiplist data structure.
A collection of Python code, notebooks, and examples for practical business data analysis and visualization.
A JavaScript data manipulation library for fast n-dimensional filtering and grouping of records.
A self-driving database management system from Carnegie Mellon University.
A Python client library for interacting with the InfluxDB time-series database.
Sample datasets for users of the Yelp Academic Dataset, useful for data analysis and machine learning.
The versioned, forkable, syncable database for developers who need a scalable, distributed data solution.
A JavaScript library that provides a NumPy-like interface for working with multi-dimensional arrays and matrices.
Crafty statistical graphics library for the Julia programming language.
This Python repository contains code examples and notes for data analysis and mining.
This R library provides historical investment returns analysis for the overall stock market.
A distributed knowledge graph store built in Go for managing large-scale semantic data.
A data quality and observability tool for monitoring and fixing data issues before they become problems.
A popular Scala library for parsing and manipulating JSON data in Scala applications.
Dex is a powerful data visualization tool that enables data exploration and publishing of web visualizations.
A C++ library for processing data streams.
An automatic database ORM library for Objective-C that provides thread-safe and deadlock-free database operations.
A Python library for using dplyr-like syntax with pandas and SQL databases.
Mondrian is an OLAP server that enables real-time analysis of large data sets for business users.
Kylo is an enterprise-grade data lake management platform built on big data technologies like Spark and Hadoop.
Prisma1 is a database toolkit with an ORM, migrations, and admin UI for Postgres, MySQL, and MongoDB.
An interactive and reactive data science platform powered by Scala and Apache Spark.
A fast B+ tree indexing structure in C for efficient storage and retrieval of billions of key-value pairs.
Concurrent data pipelines in Python for building efficient and scalable data processing workflows.
pandasql is a Python library that allows developers to use SQL syntax to query Pandas DataFrames.
Java client library for connecting to the InfluxDB time series database.
This repository provides a comprehensive guide and implementations for data algorithms using MapReduce, Spark, Java, and Scala.
A data science IDE for Python, focused on providing a user-friendly environment for data analysis and visualization.
A Python library for processing and visualizing satellite imagery data.
A large-scale entity and relation database supporting aggregation of properties for big data applications.
A data processing and ETL (Extract, Transform, Load) framework for Ruby developers.
A Python library for building business intelligence (BI) and OLAP solutions.
A Python library providing multivariate imputation and matrix completion algorithms.
A C++ repository for a 2014 Kaggle competition.
This GitHub repository provides time series data on COVID-19 cases, useful for data analysis and visualization.
An ORM for RethinkDB that provides an elegant and intuitive API for interacting with the database.
A Chinese translation of the book 'Python for Data Analysis' 2nd Edition, covering NumPy, Pandas, and other data analysis tools.
A simple embedded database library in Rust modeled after SQLite, useful for Rust projects.
Non-native graph database abstraction layer for Node.js and web browsers.
An intuitive Python library that adds plotting functionality to scikit-learn machine learning models.
Cloud-native, MySQL-compatible, AI-ready database with Git for Data, vector search, and full-text search capabilities.
Grid Studio is a web-based application for data science with full integration of open source data science frameworks and languages.
A collection of data science related questions and answers for developers.
COVID-19 data repository for developers, providing daily updated case, death, and testing information.
EasyDB is a lightweight desktop app that lets you query local CSV, Excel, and JSON files with SQL, without an external database.
A no-code, visual data integration platform for building big data pipelines and workflows.