Data & Databases

ORMs, query builders, databases, and data pipelines

Showing 5061-5080 of 5,250 projects

avehtari/BDA_py_demos

Provides Bayesian data analysis demos in Python for developers interested in probabilistic modeling.

1.0K
Stable
Jupyter Notebook
Databases
LLM Frameworks
Python
#bayesian-inference #mcmc #probabilistic-modeling

carrotsearch/hppc

High Performance Primitive Collections for Java, a library for working with Java collections efficiently.

1.0K
Experimental
Java
API Frameworks
General Utilities
#java #collections #performance

cryptonomex/graphene

This is a C++ library for building decentralized applications on the Graphene blockchain.

1.0K
Archived
C++
API Frameworks
Databases
#blockchain #decentralized #api

apache/celeborn

Apache Celeborn is a high-performance shuffle and spilled data service for big data applications.

1.0K
Active
Java
Caching
Realtime
#bigdata #shuffle #spark

supabase-community/copycat

A TypeScript library that generates deterministic fake data for seeding databases and testing.

1.0K
Archived
TypeScript
API Mocking
Databases
TypeScript
#database-seeding #deterministic #fake

thuml/TimesNet

TimesNet is an open-source library for temporal 2D-variation modeling and general time series analysis.

1.0K
Archived
ML Ops
Time Series
#time-series-analysis #temporal-modeling #open-source

facebookresearch/cc_net

Tools to download and clean up Common Crawl data, a large web crawl dataset, for further analysis and processing.

1.0K
Archived
Python
ETL & Pipelines
CLI Tools
Python
#data-processing #web-crawling #data-cleanup

geoopt/geoopt

A library for Riemannian optimization methods with PyTorch, enabling efficient optimization on Riemannian manifolds.

1.0K
Active
Python
ML Ops
API Frameworks
PyTorch
#optimization #riemannian-geometry #riemannian-manifold

megvii-research/PETR

A multi-view 3D object detection and segmentation framework that uses position embedding transformation.

1.0K
Archived
Python
Computer Vision
API Frameworks
Python
#3d-object-detection #multi-view #computer-vision

Netflix/Priam

A Java library for backup/recovery, token management, and configuration management for Cassandra databases.

1.0K
Active
Java
Authentication
Databases
#cassandra #backup #recovery

LAStools/LAStools

This repository contains efficient tools for LiDAR processing, focused on working with point cloud data.

1.0K
Active
C++
Databases
CLI Tools
#lidar #point-cloud #data-processing

TIBCOSoftware/snappydata

SnappyData is a memory-optimized analytics database based on Apache Spark and Apache Geode, enabling real-time stream processing, transactions, and predictive analytics.

1.0K
Archived
Scala
Databases
API Frameworks
Spark
#analytics #memory-database #scale

unitedstates/congress

This is a Python project for collecting public data on the work of the US Congress, including legislation, amendments, and votes.

1.0K
Stable
Python
Backend Frameworks
API Frameworks
#government-data #legislation #congress

taynaud/python-louvain

A Python library for implementing the Louvain community detection algorithm on graphs.

1.0K
Archived
Python
Databases
CLI Tools
NetworkX
#community-detection #graph-analysis #networkx

Netflix/astyanax

A Java client library for the Cassandra distributed database, providing a high-level API for interacting with Cassandra.

1.0K
Stable
Java
API Frameworks
Databases
#cassandra #nosql #database

Anviking/Decodable

Decodable is a Swift library for unmarshalling JSON data into Swift models more efficiently.

1.0K
Archived
Swift
API Clients & Testing
Backend Frameworks
Swift
#json #swift #unmarshalling

SysCV/sam-pt

A Python library that extends the Segment Anything Model (SAM) to enable zero-shot video segmentation with point-based tracking.

1.0K
Archived
Python
Computer Vision
API Frameworks
Python
#interactive-video-segmentation #zero-shot-segmentation #video-instance-segmentation

krishnaik06/Interview-Prepartion-Data-Science

This repository contains interview preparation materials for data science and machine learning.

1.0K
Archived
Jupyter Notebook
Tutorials & Courses
Interview Prep
#data-science #machine-learning #interview-preparation

dandelionsllm/pandallm

Panda is an open-source large language model project for the Chinese language, aiming to drive innovation and collaboration in NLP.

1.0K
Archived
Python
LLM Frameworks
API Frameworks
Python
#nlp #large-language-model #open-source

jsyoon0823/TimeGAN

A codebase for Time-series Generative Adversarial Networks (TimeGAN), a deep learning model for time series data generation.

1.0K
Experimental
Jupyter Notebook
ML Ops
API Frameworks
#time-series #generative-adversarial-networks #deep-learning
