Showing 451-500 of 897 trending projects
An open-source data modeling tool designed for PostgreSQL, allowing developers to generate DDL commands visually.
An acoustic spectrum analyzer library written in C++ for audio analysis and visualization.
A Rust library for interacting with Delta Lake, a data lake storage format, with Python bindings.
A simple Python wrapper for the Tabula Java library, which extracts tables from PDF files into Pandas DataFrames.
A Python library for portfolio optimization using scikit-learn and convex optimization techniques.
A Rust data structure for efficiently storing and accessing data in a sparse set.
MyBatis is a SQL mapper framework for Java that simplifies database access by mapping SQL statements and results to objects.
Apache Beam is a unified programming model for batch and streaming data processing.
OpenRefine is a powerful data cleaning and transformation tool that helps developers work with messy data.
A collection of football analytics projects, data, and analysis by Edd Webster (@eddwebster).
This repository provides a comprehensive guide on optimizing MySQL performance and solving common database problems.
A repository of NumPy exercises for developers looking to improve their Python and data manipulation skills.
A scalable, distributed ETL framework for building data lake analytics pipelines.
A Python helper library for enhancing Jupyter Notebooks with data visualization and analysis capabilities.
A comprehensive collection of geospatial tools and resources for data analysis, machine learning, and spatial applications.
Meltano is a declarative, code-first data integration engine for building and scaling data and ML-powered products.
A collection of Bible versions and cross-reference databases.
A beginner-friendly Python toolkit for financial data extraction, analysis, and automation.
A fast and flexible R package for reading flat files (CSV, TSV, fixed-width) into R data frames.
A lightweight, document-oriented database optimized for happiness, used as a Python library or CLI.
OrioleDB is a cloud-native storage engine for PostgreSQL, delivered as an extension and aimed at performance and scalability challenges.
A powerful suite of sparse matrix algorithms and libraries for scientific and numerical computing.
Build vector tilesets from large collections of GeoJSON features.
A high-performance, persistent, off-heap data structure written in Clojure for data-intensive applications.
An open-access book on scientific visualization using Python and Matplotlib for data-driven developers.
Hamilton is an open-source ETL framework that helps data scientists and engineers build modular, testable dataflows with lineage and metadata.
A JavaScript statistical library that provides a wide range of statistical functions for data analysis.
A repository containing various NLP datasets collected and organized by the owner.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark.
A curated list of Python software and packages for scientific research in audio.
An open-source, TypeScript-based Entity-Relationship Diagram (ERD) editor for developers working with databases.
An end-to-end data pipeline for building a data lake, data warehouse, and analytics platform from GoodReads data.
A Python library for clustering categorical data using the k-modes and k-prototypes algorithms.
A fast and elegant data exploration library for Elixir, providing series and dataframes for data science workflows.
DBngin is a free tool for setting up and managing local database servers for development.
A fast, lightweight search backend and alternative to Elasticsearch.
A database modeling language (DBML) that helps define and document database structures.
A flexible and powerful SQL string builder library plus a zero-config ORM for Go developers.
MetricFlow allows developers to define, build, and maintain metrics in code for business intelligence and analytics.
Diagrams and documentation for InnoDB, the storage engine used by MySQL and MariaDB databases.
A parallel corpus of classical Chinese and modern Chinese texts for language processing and analysis.
PoloDB is an embedded document database written in Rust for building cross-platform, local-first applications.
Open-source repository for sharing code related to the MIMIC family of critical care databases.
A sample database for SQL Server, Oracle, MySQL, PostgreSQL, SQLite, and DB2.
A Python module for extracting and mapping Chinese province, city, and district data.
A Python package for handling messy CSV files with improved dialect detection and a command-line interface.
Efficient in-memory cache in Go for storing and retrieving large amounts of data.
A Rust library that provides multi-writer and CRDT support for SQLite databases.
Malloy is an open-source language for describing data relationships and transformations.
The Feldera Incremental Computation Engine is a Rust-based library for building real-time data pipelines and materialized views.