Showing 101-150 of 897 trending projects
An embeddable, persistent key-value store for fast storage, built on an LSM-tree design.
A versatile app for querying, scripting, and visualizing data from various databases, files, and APIs.
A Redis desktop manager with a GUI for managing Redis databases on Linux, Windows, and macOS.
A comprehensive index of medical imaging datasets for researchers and developers working in the medical imaging field.
efinance is a Python library for quickly accessing financial data (funds, stocks, bonds, futures) and backtesting/quantitative trading.
A Java-based database subsetting and relational data browsing tool for popular databases.
A comprehensive search tool for finding Chinese NLP datasets, with support for common English NLP datasets as well.
A high-performance, embeddable key-value storage engine written in Rust for developers building data-intensive applications.
A Redis-compatible database implemented in Go, supporting SQL and multiple backends like PostgreSQL and SQLite.
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
A Python library to programmatically access and manage photos and metadata in the Apple Photos library on macOS.
The ultimate set of SQLite extensions for developers building applications with SQLite databases.
Portfolio analytics library for quantitative finance, built with Python
A comprehensive collection of 150+ Python programs for quantitative finance and stock market data analysis.
A Python library for creating circular data visualizations like Circos plots, chord diagrams, and radar charts.
A high-performance GPU DataFrame library for data analysis and machine learning workloads.
An embeddable, replicated, and fault-tolerant SQL engine for building robust and scalable applications.
Hamilton is an open-source ETL framework that helps data scientists and engineers build modular, testable dataflows with lineage and metadata.
A Postgres extension for high-performance vector search, complementing pgvector for scale.
Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy
An extensible, high-performance columnar file format for data storage and processing.
Apache DataFusion is a powerful SQL query engine written in Rust, designed for big data processing and analysis.
A high-performance open source query engine for sub-second analytics on data lakehouse.
A Rust library for working with the Arrow data format without relying on `transmute`.
A quantitative research and stock analysis platform for finance professionals.
SnappyData is a memory-optimized analytics database based on Apache Spark and Apache Geode, enabling real-time stream processing, transactions, and predictive analytics.
A free, open-source Python library for fetching real-time stock data from Chinese stock exchanges.
A geospatial data library for Ruby that provides a set of tools for working with geographic data.
Apache Doris is a high-performance, unified analytics database for real-time data processing.
A Python library for crawling historical data of China stocks.
Distributed MySQL database system for horizontal scaling
Fluvio is an event stream processing engine for developers to build responsive data-intensive apps.
Data quality assessment and reporting tool for data frames and database tables in R
Comprehensive dataset of China's administrative divisions (province, city, county, town) in JSON, CSV, and SQL formats.
Trino is a distributed SQL query engine for big data, allowing fast, scalable, and cost-effective analytics.
A curated list of tools and datasets for anomaly detection on time-series data.
Google's Operations Research tools for combinatorial optimization and linear programming.
dplyr is an R library that provides a grammar of data manipulation.
Open-source data warehouse learning project with examples and code for building real-time and offline data pipelines.
PyWavelets is a Python library for wavelet transform algorithms and techniques, useful for image and signal processing.
A powerful Python library for record linkage and duplicate detection in data-driven applications.
A high-performance Java library for data analysis, visualization, and machine learning.
A data access layer (DAL) and ORM-like library for working with SQL and NoSQL databases in Go.
Apache Arrow is a fast columnar data format and toolset for in-memory analytics and data interchange.
Apache Iceberg is an open-source table format for large analytic datasets, providing a versioned and scalable data lake architecture.
An open-source Python library that simplifies the process of loading data into data lakes and warehouses.