Showing 101-150 of 897 trending projects
DVC is a data versioning and ML experiment tracking tool that helps developers manage and track data and model changes.
A comprehensive database of countries, states, and cities with data in multiple formats.
A Python library for quantitative trading and stock analysis.
Open-source, cloud-native, unified observability database for metrics, logs and traces, supporting SQL/PromQL/Streaming.
Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy.
A curated list of data science interview questions and answers for developers.
ORM for TypeScript and JavaScript with support for multiple databases and platforms.
Apache Flink is a stream processing framework for real-time and batch data processing.
A high-performance NoSQL data store compatible with Apache Cassandra and Amazon DynamoDB.
PRQL is a modern, powerful, and pipelined SQL replacement for transforming data.
An open-source multi-tool for exploring and publishing data, focused on simplifying data analysis and sharing.
A Python package for accessing and analyzing Formula 1 racing data, including results, schedules, timing, and telemetry.
DuckLake is an integrated data lake and catalog format written in C++.
A comprehensive list of learning materials to help developers understand database internals.
A curated list of awesome big data frameworks, resources and other awesomeness.
Dexie.js is a minimalistic IndexedDB wrapper that simplifies offline storage and database management in web applications.
A Python library for downloading, parsing, and analyzing health data from Garmin, FitBit, and MS Health.
GlobalBuildingAtlas is an open, global, and complete dataset of building polygons, heights, and LoD1 3D models.
Fast, embeddable key-value database written in Go for building high-performance storage applications.
Fast, cost-effective data replication tool from Postgres to data warehouses, queues, and storage.
Argo Workflows is a powerful open-source workflow engine for Kubernetes, enabling complex data processing and machine learning pipelines.
A modular quantitative trading framework for algorithmic trading, backtesting, and financial analysis.
Blazing-fast data wrangling toolkit for AI and data engineering workflows.
Fast, accurate, and scalable probabilistic data linkage with support for multiple SQL backends.
A fast, lightweight SQLite-based persistence layer with CloudKit synchronization for Swift developers.
FoundationDB is an open-source, distributed, transactional key-value store that provides ACID guarantees.
A transactional, relational-graph-vector database that uses Datalog for query, designed for AI and ML use cases.
A distributed database with CRDT sync, offline support, and end-to-end encryption for vibe coders.
lakeFS is a Git-like version control system for data lakes, enabling data engineers to manage data versioning and data quality.
Nebula is a fast, open-source, distributed graph database with horizontal scalability and high availability.
A flexible and standardized cookiecutter template for doing and sharing data science work in Python.
A Python library for financial portfolio optimization, including classical efficient frontier and advanced techniques.
Database manager for multiple database engines, runs as desktop or web app.
Statsmodels is a Python library for statistical modeling and econometrics, providing tools for data analysis and prediction.
Unified cloud-native data warehouse platform for analytics, search and AI, built on top of S3 storage.
This repository contains code samples for SQL Server, Azure SQL, and related data services from Microsoft.
A repository of data science interview questions and answers for developers.
Reactive, local-first database for JavaScript apps with real-time sync and flexible storage.
A lightweight, fault-tolerant distributed database built on SQLite, designed for high availability.
A lightweight SQLite3 driver for Go that implements the database/sql interface.
A collection of data science projects in Python using Jupyter Notebook.
A simple Windows desktop app for viewing and querying Apache Parquet files, a popular big data format.
A tutorial on writing a SQLite clone from scratch in C, useful for developers who want to understand how databases work internally.
A high-quality, cross-platform data plotting library for Rust developers, including WebAssembly support.
Scalable and efficient data transformation framework with backwards compatibility for dbt.
Kibana is an open-source data visualization and management tool for Elasticsearch.
An open-source data lakehouse framework that enables building data pipelines with leading big data compute engines.