Data & Databases

ORMs, query builders, databases, and data pipelines


apache/druid

Apache Druid is a high-performance real-time analytics database for data-intensive applications.

14.0K
Active
Java
Databases
#real-time-analytics #data-processing #high-performance

dask/dask

Dask is a Python library for parallel computing and distributed data processing that scales NumPy- and pandas-style workloads beyond a single machine.

13.8K
Active
Python
Databases
Python
#parallel-computing #distributed-data-processing #data-analysis

sql-js/sql.js

A JavaScript library that allows you to run SQLite on the web, enabling local database functionality for web apps.

13.6K
Experimental
JavaScript
Databases
JavaScript
#sqlite #javascript #web-assembly

Data-Centric-AI-Community/ydata-profiling

A Python library for fast, customizable, and interactive data profiling and exploratory data analysis.

13.4K
Active
Python
Data Profiling
Python
#data-profiling #exploratory-data-analysis #data-quality

juicedata/juicefs

JuiceFS is a distributed POSIX file system that pairs a metadata engine such as Redis with object storage such as S3, aimed at big data and cloud-native applications.

13.3K
Active
Go
Databases
Go
#object-storage #s3 #redis

K-Dense-AI/claude-scientific-skills

A library of ready-to-use scientific skills for the Claude AI assistant, focused on bioinformatics, drug discovery, and scientific computing.

13.2K
Active
Python
LLM Wrappers & SDKs
API Frameworks
Python
#claude #scientific-computing #bioinformatics

google/or-tools

Google's Operations Research tools for combinatorial optimization and linear programming.

13.2K
Active
C++
Operations Research
#combinatorial-optimization #linear-programming #operations-research

datastacktv/data-engineer-roadmap

A roadmap for becoming a data engineer.

12.7K
Archived
Data Engineering
#data-engineering #roadmap #cloud

trinodb/trino

Trino is a distributed SQL query engine for big data, allowing fast, scalable, and cost-effective analytics.

12.6K
Active
Java
Databases
#big-data #analytics #data-science

debezium/debezium

An open-source framework for change data capture from various databases using Apache Kafka.

12.5K
Active
Java
ETL & Pipelines
Apache Kafka
#change-data-capture #event-streaming #database

citusdata/citus

Citus is a PostgreSQL extension that distributes data and queries across multiple nodes, letting you scale out a Postgres database.

12.3K
Active
C
Databases
#distributed-database #postgresql #relational-database

dbt-labs/dbt-core

dbt enables data analysts and engineers to transform data using software engineering practices.

12.3K
Active
Python
ETL & Pipelines
Python
#analytics #business-intelligence #data-modeling

mysql/mysql-server

Open-source relational database engine powering web apps, APIs, and data-driven backends worldwide.

12.2K
Active
C++
Databases
Node.js
#mysql #relational-database #sql

vesoft-inc/nebula

Nebula is a fast, open-source, distributed graph database with horizontal scalability and high availability.

12.1K
Stable
C++
Databases
C++
#database #graph-database #distributed

OpenRefine/OpenRefine

OpenRefine is a powerful data cleaning and transformation tool that helps developers work with messy data.

11.8K
Active
Java
Data Cleaning & Transformation
Java
#data-analysis #data-wrangling #data-cleaning

TA-Lib/ta-lib-python

Python wrapper for the TA-Lib technical analysis library, useful for financial pattern recognition.

11.8K
Active
Cython
Libraries
Python
#finance #pattern-recognition #quantitative-finance

dhamaniasad/awesome-postgres

A curated list of awesome PostgreSQL software, libraries, tools and resources.

11.7K
Active
Databases
#postgresql #database #cli

Tencent/wcdb

WCDB is a cross-platform database framework developed by WeChat for Android, iOS, Linux, macOS, and Windows.

11.7K
Active
C
Databases
#database #cross-platform #android

originalankur/maptoposter

A Python library that transforms cities into beautiful, minimalist map posters with code.

11.6K
Active
Python
Charts & Visualization
Backend Frameworks
Python
#maps #visualization #design

datahub-project/datahub

An open-source metadata platform for managing your data and AI stack across the enterprise.

11.6K
Active
Java
Data Catalog
#data-catalog #data-discovery #data-governance