Data & Databases

ORMs, query builders, databases, and data pipelines

Showing 1201-1220 of 5,250 projects

camelot-dev/camelot

A Python library for extracting tabular data from PDF files, useful for data processing and analysis.

3.6K
Active
Python
Databases
API Frameworks
#pdf#data-extraction#tabular-data

rsnapshot/rsnapshot

rsnapshot is a Perl-based tool for backing up data using rsync, useful for developers who need reliable backups.

3.6K
Experimental
Perl
CLI Tools
Realtime
#backup#rsync#cli

yangwohenmai/LSTM

A Python library for time series forecasting using LSTM neural networks.

3.6K
Archived
Python
Forecast
API Frameworks
#time-series#forecasting#lstm

conduktor/kafka-stack-docker-compose

A Docker Compose setup to quickly spin up a fully-featured Kafka stack for development and testing.

3.6K
Active
Shell
Realtime
Containerization
#kafka#docker#docker-compose

awslabs/deequ

Deequ is a Scala library for defining "unit tests for data" to measure data quality in large datasets.

3.6K
Active
Scala
ETL & Pipelines
Testing
Spark
#data-quality#unit-testing#apache-spark

mckinsey/vizro

Vizro is a low-code toolkit for building high-quality data visualization apps using Python and Plotly.

3.6K
Active
Python
Charts & Visualization
Databases
Plotly
#data-visualization#dashboards#plotly

go-jet/jet

A type-safe SQL builder with code generation and automatic query result data mapping for Go.

3.6K
Active
Go
API Frameworks
ORMs & Query Builders
#sql-builder#type-safe#code-generation

apache/singa

A distributed deep learning platform for building and deploying AI models at scale.

3.6K
Active
C++
ML Ops
API Frameworks
#deep-learning#distributed-computing#ai-models

Visualize-ML/Book5_Essentials-of-Probability-and-Statistics

A Jupyter Notebook-based resource covering the essentials of probability and statistics, relevant for machine learning.

3.6K
Stable
Jupyter Notebook
Machine Learning
Books & Guides
#probability#statistics#machine-learning

TimelyDataflow/timely-dataflow

A modular Rust implementation of timely dataflow, a system for building high-performance concurrent applications.

3.6K
Active
Rust
API Frameworks
Databases
#timely-dataflow#concurrency#performance

oliver006/redis_exporter

A Prometheus exporter for collecting Redis and Valkey metrics for monitoring and observability.

3.6K
Active
Go
Monitoring
Caching
#prometheus#redis#metrics

GeneralMills/pytrends

A Python library that provides a simple API for accessing Google Trends data.

3.6K
Archived
Python
API Clients & Testing
Caching
Python
#google-trends#data-analysis#api-client

google/zopfli

A compression library written in C that produces dense zlib/deflate-compatible output at the cost of slower compression.

3.6K
Archived
C++
API Frameworks
Caching
#compression#zlib#deflate

nmslib/nmslib

An efficient similarity search library and toolkit for evaluating k-NN methods in non-metric spaces.

3.6K
Active
C++
Computer Vision
Vector Databases
#k-nn#similarity-search#non-metric

restatedev/restate

Restate is a Rust-based platform for building resilient, fault-tolerant applications without the need for complex infrastructure.

3.6K
Active
Rust
API Frameworks
Databases
#async-await#distributed-systems#durable-execution

pytorch/text

A PyTorch-powered library for loading and processing text data for natural language processing tasks.

3.6K
Stable
Python
LLM Frameworks
Datasets
PyTorch
#nlp#data-loader#deep-learning

LinShunKang/MyPerf4J

High-performance Java APM tool powered by ASM for performance analysis and monitoring.

3.6K
Stable
Java
Monitoring
API Frameworks
#performance#monitoring#profiling

CosmosShadow/gptpdf

A Python library that uses GPT to parse and extract information from PDF documents.

3.6K
Experimental
Python
LLM Wrappers & SDKs
API Frameworks
Python
#pdf#parsing#extraction

nutsdb/nutsdb

A simple, fast, and embeddable key-value store written in Go that supports transactions and data structures.

3.6K
Active
Go
Databases
API Frameworks
#key-value-store#transactions#data-structures

jdorfman/awesome-json-datasets

A curated list of awesome JSON datasets that don't require authentication.

3.6K
Archived
JavaScript
Databases
Caching
#json#datasets#data