Showing 151-200 of 897 trending projects
Docker images containing Jupyter applications for data science and machine learning workflows.
A curated list of data engineering tools for software developers.
Pentaho Data Integration (ETL) is a Java-based tool for building data integration and ETL pipelines.
A toolkit for SQLite databases, focused on application development with a Swift-based API.
A collection of Python code examples and tutorials for data science, machine learning, and web development.
Efficient in-memory cache in Go for storing and retrieving large amounts of data.
A collection of readings and resources related to databases.
Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy.
Azure Data Studio is a data management and development tool with connectivity to popular cloud and on-premises databases.
An open-source PostgreSQL client application for macOS, providing an easy way to set up and manage a local PostgreSQL database.
A lightweight, document-oriented database optimized for happiness, used as a Python library or CLI.
The versioned, forkable, syncable database for developers who need a scalable, distributed data solution.
Open-source relational database management system (RDBMS) for building data-driven applications.
AlaSQL is a JavaScript SQL database for browser and Node.js that handles both relational tables and nested JSON data.
Records is a Python SQL library that makes interacting with databases more intuitive and human-friendly.
An educational distributed SQL database written in Rust.
A tutorial and implementation of a disease-centered medical knowledge graph and QA system.
Draco is a C++ library for compressing and decompressing 3D geometric meshes and point clouds.
Alluxio is an open-source data orchestration platform for analytics and machine learning workloads in the cloud.
This is a comprehensive financial database with 300,000+ symbols including equities, currencies, and cryptocurrencies.
A comprehensive set of Python notes and resources for developers, covering a wide range of topics including data science, machine learning, and scientific computing.
Pandas Cookbook is a collection of recipes for using Python's powerful data analysis library, Pandas.
A powerful customer data pipeline for collecting, processing, and analyzing user events and behavior.
SQLBoiler is a Go ORM that generates code tailored to your database schema, making it easy to interact with databases.
An open-source, scalable, and fault-tolerant NoSQL database with a focus on reliability and offline-first design.
Portfolio analytics library for quantitative finance, built with Python.
Database manager for multiple database engines, runs as desktop or web app.
SQLDelight generates type-safe Kotlin APIs from SQL, enabling easier database management in Kotlin projects.
A collection of notebooks covering quantitative finance and numerical methods in Python.
A collection of data analysis and machine learning projects and resources for developers.
Zeppelin is a web-based notebook that enables data-driven, interactive data analytics and collaborative documents.
Hazelcast is a high-performance, distributed in-memory data platform for real-time insights and stream processing.
A free, open-source SQLite database manager for multiple platforms.
A curated list of awesome R packages, frameworks and software for data analysis and data science.
Comprehensive collection of city and administrative region data for China, with features like CSV export, JS code generation, and web scraping.
Flink CDC is a streaming data integration tool that enables real-time data pipelines and change data capture.
A suite of utilities for converting to and working with CSV, the king of tabular file formats.
LevelDB key/value database in Go for building high-performance data-intensive applications.
Pachyderm is a data-centric pipeline and data versioning platform for building and scaling data-intensive applications.
Apache Pinot is a realtime distributed OLAP datastore for fast querying of large datasets.
Open-source, cloud-native, unified observability database for metrics, logs and traces, supporting SQL/PromQL/Streaming.
Apache Hive is data warehouse software built on top of Apache Hadoop for querying and managing large datasets.
No description provided for this medical data repository.
A curated list of free/public domain text datasets for natural language processing (NLP) tasks.
GDAL is an open-source library for working with various geospatial data formats, useful for remote sensing and GIS applications.
A powerful, multi-database ORM for .NET that supports a wide range of SQL databases and provides a seamless data access layer.
AliSQL is a MySQL branch originating from Alibaba Group, focused on high performance and scalability.
KurrentDB is an event-native database designed for modern software and event-driven architectures.
Immutable database and Datalog query engine for Clojure, ClojureScript and JS.