A Python library for data manipulation and analysis, part of the core data science toolkit.
Open-source, free A-share quantitative trading data platform focused on China's stock market
Idempotent schema management tool for MySQL, PostgreSQL, SQLite, and SQL Server databases.
A Python script to fetch Garmin health data and populate it in an InfluxDB database for visualization in Grafana.
An open-source data orchestration platform for developing, running, and observing data pipelines and workflows.
Fast, typo-tolerant search engine for building delightful search experiences
JuiceFS is a distributed POSIX file system built on top of Redis and S3 for big data and cloud-native applications.
A high-performance Python library for data extraction, analysis, conversion and manipulation of PDF and other documents.
Embeddable, persistent key-value store for fast storage with LSM design
libSQL is an open-source, open-contribution fork of SQLite, a widely used embedded database.
Fast local PDF-to-Markdown/JSON converter for RAG pipelines. No GPU needed.
An open-source data catalog platform for building a high-performance, federated metadata lake.
An extensible, high-performance columnar file format for data storage and processing.
dbt enables data analysts and engineers to transform data using software engineering practices.
Comprehensive Chinese poetry database with JSON-formatted data for developers
A Python library for creating circular data visualizations like Circos plots, chord diagrams, and radar charts.
SnappyData is a memory-optimized analytics database based on Apache Spark and Apache Geode, enabling real-time stream processing, transactions, and predictive analytics.
Open-source relational database management system (RDBMS) for building data-driven applications.
GORM is a developer-friendly ORM library for Golang, offering features like associations, hooks, and auto migrations.
A geospatial data library for Ruby that provides a set of tools for working with geographic data.
Redis desktop manager with a GUI for managing Redis databases on Linux, Windows, and macOS
Data quality assessment and reporting tool for data frames and database tables in R
A transmute-free Rust library for working with the Arrow data format.
Cloud-native distributed SQL database for modern applications
Official Git mirror of the SQLite source tree, a popular and widely-used embedded database engine.
A curated list of free/public domain text datasets for natural language processing (NLP) tasks.
Open-source data warehouse learning project with examples and code for building real-time and offline data pipelines.
A Python package for accessing and analyzing Formula 1 racing data, including results, schedules, timing, and telemetry.
High-performance distributed graph database for real-time use cases
Apache Doris is a high-performance, unified analytics database for real-time data processing.
A powerful Python library for record linkage and duplicate detection in data-driven applications.
Apache DataFusion is a powerful SQL query engine written in Rust, designed for big data processing and analysis.
An open-source Python library that simplifies the process of loading data into data lakes and warehouses.
An open-source PostgreSQL client application for macOS, providing an easy way to set up and manage a local PostgreSQL database.
A comprehensive English word database with translations, parts of speech, and definitions for developers.
A database modeling language (DBML) that helps define and document database structures.
MongoEngine is a Python Object-Document-Mapper (ODM) for working with MongoDB databases.
The ultimate set of SQLite extensions for developers building applications with SQLite databases.
Trino is a distributed SQL query engine for big data, allowing fast, scalable, and cost-effective analytics.
A Rust-based, Elasticsearch-quality search engine for PostgreSQL, enabling fast, real-time analytics and HTAP use cases.
A high-performance open-source query engine for sub-second analytics on data lakehouses.
efinance is a Python library for quickly fetching financial data (funds, stocks, bonds, futures), useful for backtesting and quantitative trading.
A library that allows developers to use LINQ to retrieve data from spreadsheets and CSV files.
db.py is a Python library that provides an easier way to interact with your databases.