Showing 601-650 of 897 trending projects
A Python library for portfolio optimization and back-testing in finance.
A versatile Python library for bioinformatics, providing data structures, algorithms, and educational resources.
An end-to-end data pipeline for building a data lake, data warehouse, and analytics platform from GoodReads data.
A Python library for performing multivariate exploratory data analysis, including techniques like PCA, CA, MCA, MFA, and FAMD.
This GitHub repository provides tutorials on effectively using the Pandas library for data analysis.
This is Facebook's branch of the Oracle MySQL database, including the MyRocks storage engine.
A collection of Unix, R, and Python tools for bioinformatics and data science projects.
A powerful customer data pipeline for collecting, processing, and analyzing user events and behavior.
A curated list of awesome materials and resources for database development.
A JavaScript library for efficient querying and transformation of array-backed data tables.
A high-performance B-tree implementation for Go, useful for building database-like applications.
SchemaCrawler is a free database schema discovery and comprehension tool that supports various database management systems.
Tegola is an open-source Mapbox Vector Tile server written in Go, enabling efficient geospatial data visualization.
A fast and efficient C++ hash map and hash set implementation using robin hood hashing.
A simple, fast and versatile Datalog database written in Clojure for vibe coders.
A Python library for reading and writing a wide range of image and video formats, including DICOM and animated GIFs, with support for webcam capture.
A Python tool that generates Entity Relationship Diagrams (ERDs) from SQLAlchemy models.
An open-source threat hunting platform built on the ELK stack for security researchers and analysts.
A DICOM to NIfTI converter for medical imaging research and neuroimaging applications.
A standard filetree template for data curation and organization, useful for developers interested in data management.
PySAL is a Python Spatial Analysis Library meta-package for geographical data analysis and modeling.
esProc SPL is a JVM-based programming language for structured data computation, serving as both a data analysis tool and an embedded computing engine.
A tutorial for performing statistical data analysis using Python, covering topics like regression, hypothesis testing, and more.
Useful scripts, UDFs, views, and other utilities for migration and data warehouse operations in BigQuery.
A fast, embeddable column database written in Go, optimized for AI/ML workloads.
A Python library for downloading, parsing, and analyzing health data from Garmin, FitBit, and MS Health.
A collection of PySpark examples covering RDD, DataFrame, and Dataset operations in Python.
Python code for "Causal Inference: What If", a book by Miguel Hernán and James Robins.
gget is a Python library that enables efficient querying of genomic reference databases like NCBI, Ensembl, and UniProt.
Azure Data Studio is a data management and development tool with connectivity to popular cloud and on-premises databases.
A tool for comparing and evaluating databases for time series data.
A parallel processing library for Pandas that improves performance on multi-core CPUs.
An R kernel for Jupyter, enabling interactive R programming in notebooks.
Cartopy is a Python library for creating maps and visualizing spatial data with matplotlib support.
ggstatsplot is an R library that enhances ggplot2 visualizations with statistical analysis and hypothesis testing.
DBngin is a free, open-source, cross-platform database management tool for developers.
GraphFrames provides DataFrame-based Graphs for Apache Spark, enabling scalable graph analysis and algorithms.
A unified interface for distributed computing on Spark, Dask and Ray without any rewrites.
A Go ORM and query builder for interacting with databases in Go applications.
A pure Go library for reading and writing Parquet files, a columnar data format.
Core database component for the Realm Mobile Database SDKs, a popular NoSQL database for mobile apps.
Open-source BI platform for engineers to explore and model large-scale data pipelines.
A modern SQL database written in Go for embedded and mobile applications.
A Rust data structure for efficiently storing and accessing data in a sparse set.
A data access layer (DAL) and ORM-like library for working with SQL and NoSQL databases in Go.
An R package for Bayesian generalized multivariate non-linear multilevel models using Stan.
A comprehensive resource for developers to learn and get started with data engineering using Python.
A collection of procedures for the Neo4j graph database, providing advanced graph algorithms and utilities.
A high-performance compression library written in C for developers working with large data sets.
A Python library for analyzing movement trajectory data using GeoPandas.