Showing 201-250 of 897 trending projects
A beginner-friendly Python toolkit for financial data extraction, analysis, and automation.
Distributed SQL database middleware for sharding, scalability, and security.
An acoustic spectrum analyzer library written in C++ for audio analysis and visualization.
An awesome list of open-source data engineering projects for developers.
A Chinese translation of a popular book on using Python for data analysis with libraries like pandas and numpy.
A collection of football analytics projects, data, and analysis by Edd Webster (@eddwebster).
ArcticDB is a high-performance, serverless DataFrame database for the Python data science ecosystem.
OctoSQL is a powerful SQL query tool that allows you to join, analyze, and transform data from multiple databases and file formats.
A highly scalable, high-performance graph database that supports over 100 billion data points.
A high-performance, concurrent, embedded key-value database written in Rust.
A Rust library for interacting with Delta Lake, a data lake storage format, with Python bindings.
Hamilton is an open-source ETL framework that helps data scientists and engineers build modular, testable dataflows with lineage and metadata.
A Python library for portfolio optimization using scikit-learn and convex optimization techniques.
A collection of Bible versions and cross-reference databases.
Kedro is a Python toolkit for building production-ready data science and machine learning pipelines.
A fast, scalable, and distributed database for transactional, analytical, and AI workloads.
This is a Python project for big data analysis, focusing on HQL, SQL, and data processing.
A Rust library that provides multi-writer and CRDT support for SQLite databases.
WebAssembly version of the DuckDB analytical database, enabling fast in-browser analytics and SQL queries.
Fast, lightweight search backend alternative to Elasticsearch.
Biopython is a set of Python modules that provide a wide range of functionality for bioinformatics, including DNA/RNA/protein sequence analysis, phylogenetics, and more.
A Python library for financial data visualization using Matplotlib, focused on candlestick and OHLC charts.
Open-source repository for sharing code related to the MIMIC family of critical care databases.
Redisson is a Java client for Redis and Valkey with distributed objects and services.
Dask is a Python library for parallel computing and distributed data processing, providing a scalable alternative to NumPy and Pandas.
An open-source, scalable, and fault-tolerant NoSQL database with a focus on reliability and offline-first design.
A collection of notebooks covering quantitative finance and numerical methods in Python.
Flink CDC is a streaming data integration tool that enables real-time data pipelines and change data capture.
Utility functions for dbt projects, a popular data transformation tool for data engineers.
A curated list of software packages and data resources for single-cell analysis, including RNA-seq and ATAC-seq.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark.
An ultra-lightweight database that supports key-value and time series data for embedded and IoT applications.
Fast, single-binary C++ SQL ETL pipeline for stream processing, observability, analytics, and AI/ML.
A robust Python library for materials analysis and computational materials science.
DiceDB is an open-source, fast, reactive, in-memory database optimized for modern hardware.
A Java-based database subsetting and relational data browsing tool for popular databases.
An open-source, community-driven platform for data-intensive scientific analysis and visualization.
A tutorial and implementation of a disease-centered medical knowledge graph and QA system.
A Python library that provides efficient, Pythonic data structures for sorted lists, dictionaries, and sets.
A code repository for a book on practical statistics for data scientists.
Sample database for SQL Server, Oracle, MySQL, PostgreSQL, SQLite, and DB2.
Tonbo is an embedded database for serverless and edge runtimes, optimized for offline-first and big data use cases.
A collection of code snippets and tutorials for data science and data analysis in Python.
ArangoDB is a multi-model database supporting documents, graphs, and key-values for high-performance applications.
A high-performance GPU DataFrame library for data analysis and machine learning workloads.
mage-ai is a Python-based platform for building, running, and managing data pipelines and integrating/transforming data.
The ultimate set of SQLite extensions for developers building applications with SQLite databases.
A LINQ-to-database provider for .NET, supporting various database engines.
A Python library for creating easy-to-use, visually appealing data tables and summaries.
sq is a Go-based data wrangling tool that supports a variety of data formats and databases.