Showing 751-800 of 897 trending projects
An R package for training and plotting classification and regression models.
A blazingly fast analytics database built with Rust, optimized for rapidly devouring large amounts of data.
An open-source COVID-19 dashboard powered by the fastpages framework, featuring data visualizations.
A framework-agnostic, datastore-agnostic JavaScript ORM built for ease of use and peace of mind.
Concurrent data pipelines in Python for building efficient and scalable data processing workflows.
A data quality and observability tool for monitoring and fixing data issues before they become problems.
A Go ORM and query builder for interacting with databases in Go applications.
HiBench is a big data benchmark suite for evaluating the performance of different big data frameworks.
AgensGraph is a transactional graph database based on PostgreSQL for enterprise-level applications.
TensorBase is a new big data warehousing solution built with Rust, focused on high-performance analytics.
QueryKit is a simple CoreData query language for Swift and Objective-C developers.
CarbonData is a high-performance data store solution for big data analytics on Hadoop and Spark.
A distributed, Redis-compatible NoSQL database that provides high performance and scalability.
This repository provides code examples for Oracle's AI-enabled database features and integrations.
Quilt is a data mesh for connecting people with actionable data, built with TypeScript.
Modern database IDE for dev & data workflows, supporting MySQL, PostgreSQL & MongoDB.
A comprehensive resource for developers to learn and get started with data engineering using Python.
A corpus of company names, abbreviations, and brands that can be used for Chinese text segmentation and entity recognition.
An ActiveRecord-like API for CoreData, a powerful object-relational mapping (ORM) framework for iOS development.
Embedded Go Database, a fast open-source NoSQL database solution for Go projects.
A scalable, SQL-based streaming analytics platform from Uber, built on top of Apache Flink.
A C++ library for processing data streams.
db.py is a Python library that provides an easier way to interact with your databases.
An automatic database ORM library for Objective-C that provides thread-safe and deadlock-free database operations.
Java client library for connecting to the InfluxDB time series database.
A Python data analysis library optimized for humans instead of machines.
A Python library for using dplyr-like syntax with pandas and SQL databases.
Distributed, massively parallel SQL query engine for big data analytics and timeseries workloads.
A Python library that summarizes news articles by extracting the most important sentences.
This GitHub repository provides time series data on COVID-19 cases, useful for data analysis and visualization.
A collection of open data sets and tools for data science and machine learning tasks.
A Swiss army knife for big data, enabling seamless integration with popular data warehousing solutions.
An open-source platform for building and sharing datasets, focused on trust, privacy, and decentralization.
Connect processes into powerful data pipelines with a simple git-like filesystem interface.
TrailDB is an efficient database for storing and querying series of events.
Eloquent ORM for Java 8, 11, 17, 21, and 23 and Spring Boot 2.x and 3.x.
This repository provides a comprehensive guide and implementations for data algorithms using MapReduce, Spark, Java, and Scala.
RRDtool is a time-series database system for efficiently storing and graphing data.
A repository containing various NLP datasets collected and organized by the owner.
Core database component for the Realm Mobile Database SDKs, a popular NoSQL database for mobile apps.
A geospatial data library for Ruby that provides a set of tools for working with geographic data.
A Python helper library for enhancing Jupyter Notebooks with data visualization and analysis capabilities.
Tools to download and cleanup Common Crawl data, a large web crawl dataset, for further analysis and processing.
A library of functional, durable data structures written in Java for developers building robust applications.
A Python library for searching and downloading Copernicus Sentinel satellite images for geographic data analysis.
A space-efficient C++ implementation of the Cuckoo filter, a probabilistic data structure for set membership testing.
SciRuby provides a collection of tools for scientific computation in Ruby, catering to developers working with data and scientific applications.
Pachyderm is a data-centric pipeline and data versioning platform for building and scaling data-intensive applications.
Automatically visualize your pandas dataframes with a single print command, enabling quick EDA.