Showing 151-200 of 897 trending projects
A toolkit for SQLite databases, focused on application development with a Swift-based API.
A free, open-source Python library for fetching real-time stock data from Chinese stock exchanges.
MongoDB data stream pipeline tools for managing real-time data synchronization and replication.
A Postgres extension for high-performance vector search, complementing pgvector for scale.
The "Anatomy of Matplotlib" tutorial for the SciPy conference, focused on data visualization for scientific computing.
Reactive, local-first database for JavaScript apps with real-time sync and flexible storage.
A compilation of R and Python code for data science and machine learning projects.
A fast, scalable, and distributed database for transactional, analytical, and AI workloads.
Open-source graph database optimized for dynamic analytics and streaming data environments.
Distributed transactional key-value database, originally created to complement TiDB.
Unified cloud-native data warehouse platform for analytics, search and AI, built on top of S3 storage.
A component for incremental subscription to and consumption of the MySQL binlog.
A Python library that helps ensure data quality and reliability through data profiling and testing.
A Redis-compatible database implemented in Go, supporting SQL and multiple backends like PostgreSQL and SQLite.
Nebula is a fast, open-source, distributed graph database with horizontal scalability and high availability.
A powerful customer data pipeline for collecting, processing, and analyzing user events and behavior.
A comprehensive list of learning materials to help developers understand database internals.
An extensible, high-performance columnar file format for data storage and processing.
A distributed database with CRDT sync, offline support, and end-to-end encryption for vibe coders.
A personal data aggregator and analysis tool for self-tracking and quantified self enthusiasts.
A multi-page Streamlit app for geospatial data visualization and analysis, useful for housing and real estate applications.
GlobalBuildingAtlas is an open global and complete dataset of building polygons, heights and LoD1 3D models.
A Kotlin library for structured data processing, suitable for data analysis and data science tasks.
SheetJS Spreadsheet Data Toolkit for data extraction and spreadsheet generation.
An open-source multi-tool for exploring and publishing data, focused on simplifying data analysis and sharing.
A repository of open-source data sets created for stories on The Pudding, a digital publication focused on data journalism.
A Python project for big data analysis, focusing on HQL, SQL, and data processing.
Fastest open-source data pipeline tool for replicating databases to data lakes in Apache Iceberg format.
A Python library for implementing the Louvain community detection algorithm on graphs.
Fast local PDF-to-Markdown/JSON converter for RAG pipelines. No GPU needed.
A high-performance, embeddable key-value storage engine written in Rust for developers building data-intensive applications.
Kibana is an open-source data visualization and management tool for Elasticsearch.
Statsmodels is a Python library for statistical modeling and econometrics, providing tools for data analysis and prediction.
A high-performance, distributed data integration tool for batch, streaming, and CDC use cases.
An automatic database ORM library for Objective-C that provides thread-safe and deadlock-free database operations.
MySQL Connector/J is a JDBC driver that enables Java applications to connect to MySQL databases.
Blazing-fast data wrangling toolkit for AI and data engineering workflows.
A comprehensive cookbook for data engineers, covering best practices, big data, and data engineering concepts.
Redisson is a Java client for Redis and Valkey with distributed objects and services.
An open-source, self-hosted database management tool with a spreadsheet-like interface for Postgres.
Dask is a Python library for parallel computing and distributed data processing, providing a scalable alternative to NumPy and Pandas.
A C++ library for processing data streams.
An open-source COVID-19 dashboard powered by the fastpages framework, featuring data visualizations.
A Python library that provides efficient, Pythonic data structures for sorted lists, dictionaries, and sets.
A curated list of awesome big data frameworks, resources and other awesomeness.
An open-source data lakehouse framework that enables building data pipelines with leading big data compute engines.
A comprehensive guide to big data technologies like Hadoop, Spark, Kafka, and more for developers.
A PostgreSQL sample database for testing and learning SQL queries.
Scalable and efficient data transformation framework with backwards compatibility for dbt.
AgensGraph is a transactional graph database based on PostgreSQL for enterprise-level applications.