Showing 501-550 of 897 trending projects
A self-driving database management system from Carnegie Mellon University.
A database solution that provides better analytics on top of MongoDB and makes it easier to migrate from MongoDB to SQL.
An open-source, community-driven platform for data-intensive scientific analysis and visualization.
This R library provides historical investment returns analysis for the overall stock market.
A high-performance, highly available, and distributed time series database written in Rust.
A repository of NumPy exercises for developers looking to improve their Python and data manipulation skills.
A graph and network visualization library for R developers working with tabular data.
A comprehensive repository covering big data knowledge, including data warehouse modeling, real-time computing, Hadoop, Spark, and more.
Documentation for the popular .NET ORM Entity Framework Core and Entity Framework 6.
A dataset of over 850,000 Chinese poems from ancient to modern times, for developers working with Chinese poetry.
The Auron accelerator framework leverages vectorized execution to speed up distributed computing on big data platforms like Spark.
A tutorial for performing statistical data analysis using Python, covering topics like regression, hypothesis testing, and more.
PaxosStore is a high-performance, distributed database solution built for large-scale applications.
A collection of data science take-home challenges and solutions implemented in Jupyter Notebooks.
A fast and accurate short-read sequence aligner written in C for genomics applications.
Utility functions for projects built with dbt, a popular data transformation tool for data engineers.
TuGraph-DB is a high-performance graph database built for fast and efficient graph data processing.
A .NET Standard library that provides strongly typed exceptions for Entity Framework Core across multiple database providers.
A Python client library for interacting with the InfluxDB time-series database.
A high-performance, MySQL-compatible vector database that supports structured and unstructured data for AI-driven applications.
R kernel for the Jupyter notebook environment, enabling interactive R programming in Jupyter.
A Python script that generates a CSV file with data about players in the English Premier League Fantasy League.
A curated list of Python software and packages for scientific research in audio.
A Python library for reading and writing a wide range of image and video formats, including DICOM, animated GIFs, and webcam capture.
A modern SQL database written in Go for embedded and mobile applications.
A Rust library that provides persistent data structures for efficient and immutable data management.
A flexible and powerful SQL string builder library plus a zero-config ORM for Go developers.
An R package for training and plotting classification and regression models.
This GitHub repository contains SQL data analysis and visualization projects using various tools and databases.
ggplot2 is a powerful data visualization library for R that provides elegant and flexible graphics.
A registry of publicly available datasets hosted on AWS for data-driven developers.
Apache Spark and Python tutorials for big data analysis and machine learning as Jupyter notebooks.
A distributed knowledge graph store built in Go for managing large-scale semantic data.
A curated list of awesome MATLAB frameworks, libraries, and software for scientific computing and data analysis.
A Rust library for quantitative finance, including tools for machine learning, option pricing, and trading.
A persistent, relational store inspired by Datomic and DataScript, written in Rust.
A curated list of awesome resources for the data transformation tool dbt, focused on analytics engineering.
Performant probabilistic data structures for processing continuous, unbounded streams in Go.
A blazingly fast analytics database built with Rust, optimized for rapidly devouring large amounts of data.
A functional, type-safe, composable Scala data access library for Postgres databases.
A comprehensive guide to feature engineering and feature selection techniques in Python, with examples.
A Python library that allows developers to easily draw datasets within their notebooks.
SQLite with Branches - a lightweight, embedded database with version control capabilities.
A fast, lightweight SQLite-based persistence layer with CloudKit synchronization for Swift developers.
A C++ library for importing OpenStreetMap data into a PostgreSQL/PostGIS database.
An advanced ORM library for Java and Kotlin developers that provides powerful caching and data management features.
An open-source COVID-19 dashboard powered by the fastpages framework, featuring data visualizations.
A standard filetree template for data curation and organization, useful for developers interested in data management.
A Python-based SQL lineage analysis tool that provides data discovery and governance insights.
A Java connector for integrating MongoDB with Hadoop ecosystems for big data processing.