Showing 451-500 of 897 trending projects
Mimesis is a fast Python library for generating fake data in multiple languages for testing and development purposes.
This repository provides code examples for Oracle's AI-enabled database features and integrations.
An open-source distributed SQL database with high availability, scalability, and ACID transactions.
This GitHub repository contains SQL data analysis and visualization projects using various tools and databases.
A data repository for the data journalism site FiveThirtyEight, containing data and code behind their articles and graphics.
This is a comprehensive learning resource for the Flink stream processing framework, covering concepts, principles, and real-world use cases.
The ultimate set of SQLite extensions for developers building applications with SQLite databases.
A Rust library for quantitative finance, including tools for machine learning, option pricing, and trading.
A lightweight data processing framework built on DuckDB and 3FS for vibe coders working with AI tools.
A Python library for downloading, parsing, and analyzing health data from Garmin, FitBit, and MS Health.
Embedded Go Database, a fast open-source NoSQL database solution for Go projects.
An automatic DBMS configuration tool for optimizing database performance.
A curated list of resources for Polars, an open-source, high-performance data manipulation library for Python and Rust.
Apache Pinot is a real-time distributed OLAP datastore for fast querying of large datasets.
Olric is a distributed, in-memory key/value store and cache for Go applications and services.
Bytewax is a Python library for building scalable, fault-tolerant, and low-latency data processing pipelines.
An advanced ORM library for Java and Kotlin developers that provides powerful caching and data management features.
Graft is an open-source transactional storage engine optimized for lazy, partial, and strongly consistent replication, ideal for edge, offline-first, and distributed applications.
Utility functions for dbt projects, a popular data transformation tool for data engineers.
A curated list of awesome resources for network analysis and visualization, with a focus on R tools.
A powerful data visualization and plotting library for the Julia programming language.
A web scraping tool for collecting data from Xiaohongshu, Bilibili, and other Chinese social platforms.
An ActiveRecord-like API for Core Data, Apple's object graph and persistence framework, for iOS development.
A simple JSON data set of country information, useful for building apps that need country data.
An embeddable, replicated, and fault-tolerant SQL engine for building robust and scalable applications.
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
An in-process OLAP SQL Engine powered by ClickHouse, enabling fast and efficient data analysis.
A comprehensive Python library for color science and color space conversions.
RRDtool is a time-series database system for efficiently storing and graphing data.
C++ DataFrame library for statistical, financial, and machine learning analysis.
Irmin is a distributed database that follows the same design principles as Git, allowing for distributed version control of data.
A Python library that provides support for pgvector, the PostgreSQL extension for vector similarity search, enabling efficient vector search and storage.
LibRaw is a C++ library for reading RAW image files from digital cameras.
This Python library provides additional linear models for statistical modeling and analysis.
A Python library that implements database internals from scratch, useful for learning database concepts.
The official C++ client API for PostgreSQL, providing a high-level interface for interacting with PostgreSQL databases.
Connect processes into powerful data pipelines with a simple Git-like filesystem interface.
Eloquent ORM for Java 8, 11, 17, 21, 23 and Spring Boot 2.x, 3.x.
Provides Bayesian data analysis demos in Python for developers interested in probabilistic modeling.
KurrentDB is an event-native database designed for modern software and event-driven architectures.
A collection of code examples and baselines for common data science and machine learning competitions.
Open-source massively parallel processing (MPP) database, an alternative to Greenplum.
Cloud-native genomic dataframes and batch computing for bioinformatics and genetics research.
SSDB is a fast NoSQL database, an alternative to Redis, with support for LevelDB and RocksDB backends.
A simple, fast and versatile Datalog database written in Clojure for vibe coders.
A collection of code snippets and tutorials for data science and data analysis in Python.
A fast and scalable library for reading and writing spreadsheet files (CSV, XLSX, ODS) in PHP.
A Go library with types and utilities for working with 2D geometry, geospatial data, and mapping.
A LINQ-to-database provider for .NET, supporting various database engines.
PyPika is a Python SQL query builder that provides a readable, Pythonic syntax for constructing complex SQL queries.