A simple, fast and versatile Datalog database written in Clojure for vibe coders.
A Python library for analyzing movement trajectory data using GeoPandas.
A collection of Unix, R, and Python tools for bioinformatics and data science projects.
Pongo is a MongoDB-compatible database that runs on top of PostgreSQL, offering strong consistency benefits.
High-performance, transactional key-value database engine for embedded systems and cryptocurrencies.
A powerful Python package to manage and work with extremely large amounts of data.
An open-source financial data extraction tool that provides easy API access to data scraped from various websites.
A curated list of Python packages for chemistry, including computational chemistry, molecular dynamics, and quantum chemistry.
A comprehensive collection of notes and resources for understanding different database technologies and concepts.
Quilt is a data mesh for connecting people with actionable data, built with TypeScript.
An educational project to build a disk-based key-value store in Python for learning purposes.
This repository provides best practices and examples for building ETL (Extract, Transform, Load) pipelines using Apache Airflow.
A Python-based image processing framework with plugins for common image processing libraries.
A visual data preparation tool powered by Python, designed for data analysis and ETL tasks.
pandasql is a Python library that allows developers to use SQL syntax to query Pandas DataFrames.
A collection of PySpark examples covering RDD, DataFrame, and Dataset operations in Python.
A tool to easily import CSV and JSON data into PostgreSQL databases.
A curated list of awesome database libraries, resources, and tools for developers.
Rust-based bindings for the NumPy C-API, enabling developers to leverage Rust for numerical computing.
PDAL is a C++ library for processing point cloud data, similar to GDAL for raster data.
Python code for Causal Inference: What If, a book by Miguel Hernán and James Robins.
A Python library that implements database internals from scratch, useful for learning database concepts.
A fast, hierarchical key-value storage engine written in C++ for applications that require high performance and scalability.
A fast, in-memory B-tree implementation for sorted collections in Swift.
A Python package for handling messy CSV files with improved dialect detection and a command-line interface.
Dex is a powerful data visualization tool that enables data exploration and publishing of web visualizations.
Modern database IDE for dev & data workflows, supporting MySQL, PostgreSQL & MongoDB.
A comprehensive resource for developers to learn and get started with data engineering using Python.
A Python library for extracting, transforming, and loading tabular data.
An open-source data pipeline tool, billed as the fastest, for replicating databases to data lakes in Apache Iceberg format.
A Rust library that enables querying Excel spreadsheets using SQLite, making data extraction and analysis more efficient.
Educational notebooks on quantitative finance, algorithmic trading, financial modeling, and investment strategy.
A command-line tool for version controlling database snapshots, allowing developers to save, restore, and archive database state.
A Python driver for the ClickHouse database with native interface support.
LuxCore is a high-performance path-tracing render engine for realistic 3D graphics and visualization.
An astronomy visualization project that maps the orbits of asteroids in the solar system.
A corpus of company names, abbreviations, and brands that can be used for Chinese text segmentation and entity recognition.
A Rust data structure for efficiently storing and accessing data in a sparse set.
An ActiveRecord-like API for CoreData, providing object-relational mapping (ORM) conveniences for iOS development.
A PHP library that provides MySQL backup functionality, similar to the mysqldump CLI tool.
Open Babel is a chemical toolbox for working with chemical data and cheminformatics.
The official C++ client API for PostgreSQL, providing a high-level interface for interacting with PostgreSQL databases.
A Python toolbox for seismology and seismological observatories, providing tools for data processing and analysis.
An exabyte-scale, multi-region distributed file system for developers building AI-powered applications.
A Python library for building business intelligence (BI) and OLAP solutions.
Python library for clustering categorical data using k-modes and k-prototypes algorithms.
A Python library for reading, manipulating, and writing data in various spreadsheet file formats.
A Java-based framework for building agile DataOps pipelines using tools like Flink, DataX, and Chunjun with a web UI.
Useful scripts, UDFs, views, and other utilities for migration and data warehouse operations in BigQuery.
A data science and machine learning library for Go, providing DataFrame functionality similar to Python's Pandas.