Showing 51-100 of 897 trending projects
Zeppelin is a web-based notebook that enables data-driven, interactive data analytics and collaborative documents.
Automatically visualize your pandas dataframes with a single print command, enabling quick EDA.
A collection of Python code examples and tutorials for data science, machine learning, and web development.
Distributed event streaming platform for data pipelines and real-time apps
Workflow orchestration for resilient data pipelines in Python
Modern SQL client for multiple databases
Modern in-memory key-value store for caching and data management
Open-source, free A-share quantitative trading data platform focused on China's stock market
A unified metadata platform for data discovery, data observability, and data governance.
A comprehensive list of learning materials to help developers understand database internals.
A curated list of awesome R packages, frameworks and software for data analysis and data science.
JuiceFS is a distributed POSIX file system built on top of Redis and S3 for big data and cloud-native applications.
Tools to download and clean up Common Crawl data, a large web crawl dataset, for further analysis and processing.
Realm is a mobile database that serves as a replacement for SQLite and ORMs.
A Python library for common data analysis and machine learning tasks
Distributed key-value store for critical distributed system data
A Python project for big data analysis, focusing on HQL, SQL, and data processing.
A powerful GUI/CLI tool for biologists to work with NGS (next-generation sequencing) data.
Framework for collecting and analyzing prediction market data with comprehensive Polymarket/Kalshi datasets.
Data integration platform for ELT pipelines from APIs, databases & files to databases, warehouses & lakes
OrioleDB is a cloud-native PostgreSQL extension that solves performance and scalability challenges.
An open-source data orchestration platform for developing, running, and observing data pipelines and workflows.
CSV Data Source for Apache Spark 1.x, a Scala library for working with structured data.
Comprehensive Chinese poetry database with JSON-formatted data for developers
A cross-platform TUI database management tool written in Go for developers working with databases.
libSQL is an open-source, open-contribution fork of SQLite, a widely used embedded database.
An open-source index of Google Trends data, useful for developers building data-driven applications.
The Go kernel for Jupyter notebooks and nteract, enabling data science and numerical computing in Go.
mage-ai is a Python-based platform for building, running, and managing data pipelines and integrating/transforming data.
The versioned, forkable, syncable database for developers who need a scalable, distributed data solution.
A high-performance Python library for data extraction, analysis, conversion and manipulation of PDF and other documents.
A Python library for data manipulation and analysis, part of the core data science toolkit.
A curated list of software packages and data resources for single-cell analysis, including RNA-seq and ATAC-seq.
A Python library for financial data visualization using Matplotlib, focused on candlestick and OHLC charts.
A Python package for accessing and analyzing Formula 1 racing data, including results, schedules, timing, and telemetry.
GlobalBuildingAtlas is an open, global, and complete dataset of building polygons, heights, and LoD1 3D models.
Ploomber is a fast and versatile tool for building and deploying data pipelines that can be used with a variety of AI and ML tools.
A collection of readings and resources related to databases.
A Python script to fetch Garmin health data and populate it in an InfluxDB database for visualization in Grafana.
Open-source relational database management system (RDBMS) for building data-driven applications.
An open-source data catalog platform for building a high-performance, federated metadata lake.
dbt enables data analysts and engineers to transform data using software engineering practices.
Official Git mirror of the SQLite source tree, a popular and widely-used embedded database engine.
Cloud-native distributed SQL database for modern applications
A Rust-based, Elasticsearch-quality search engine for PostgreSQL, enabling fast, real-time analytics and HTAP use cases.
PyPika is a Python SQL query builder that provides a readable, Pythonic syntax for constructing complex SQL queries.