Data & Databases

ORMs, query builders, databases, and data pipelines

projectnessie/nessie

Nessie is a transactional data catalog for data lakes that provides Git-like semantics and functionality.

1.4K
Active
Java
Databases
API Frameworks
#data-catalog #data-lakes #git-semantics

NVIDIA-NeMo/Curator

Scalable data preprocessing and curation toolkit for Large Language Models (LLMs)

1.4K
Active
Python
Python
#data-curation #large-language-models #data-preparation

marian-nmt/marian

A high-performance C++ library for neural machine translation, with CUDA support for GPU acceleration.

1.4K
Archived
C++
LLM Frameworks
API Frameworks
#cuda #fast #gpu

yahoo-finance/yahoo-finance

A Python module to fetch stock data from Yahoo Finance API for financial analysis and trading applications.

1.4K
Archived
Python
API Clients & Testing
Databases
#finance #stock-data #yahoo-finance-api

Aceinna/gnss-ins-sim

Open-source GNSS + inertial navigation simulator for motion trajectory generation and sensor fusion.

1.4K
Archived
Python
Arduino & Embedded
Databases
Python
#gnss #gps #imu

lnx-search/lnx

A fast, reliable search database written in Rust without the AI hype.

1.4K
Stable
Rust
API Frameworks
Search
#search #database #rust

hashintel/hash

An open-source, multi-tenant, self-building knowledge graph for developers building with AI tools.

1.4K
Active
Rust
LLM Frameworks
GraphQL
Rust
#knowledge-graph #multi-tenant #self-building

mesos/spark

Lightning-fast cluster computing in Java, Scala and Python.

1.4K
Archived
Scala
API Frameworks
ORMs & Query Builders
Scala
#cluster-computing #big-data #distributed-systems

tidyverse/tidyr

tidyr is an R package that provides a set of functions to tidy messy data into a format suitable for analysis.

1.4K
Active
R
ETL & Pipelines
CLI Tools
#data-transformation #data-cleaning #tidy-data

AlexTheAnalyst/PortfolioProjects

A collection of portfolio projects for a data analyst.

1.4K
Archived
Jupyter Notebook
Databases
ETL & Pipelines
#data-analysis #portfolio #tutorials

clipperhouse/gen

Type-driven code generation for Go, enabling powerful generic programming.

1.4K
Archived
Go
Build Tools
API Frameworks
Go
#code-generation #generics #go

mlfoundations/dclm

DataComp for Language Models is a library for training, evaluating, and deploying large language models.

1.4K
Stable
HTML
LLM Frameworks
API Frameworks
Next.js
#machine-learning #language-models #api-development

enewhuis/liquibook

A modern C++ order matching engine for building trading platforms and financial applications.

1.4K
Archived
C++
API Frameworks
Realtime
#order-matching #trading-engine #realtime

r-spatial/sf

An R package that provides support for simple features, a standardized way to encode spatial vector data.

1.4K
Active
R
Databases
CLI Tools
#gdal #geos #proj

apache/cassandra-python-driver

Python driver for Apache Cassandra, a distributed database management system.

1.4K
Stable
Python
API Frameworks
Databases
#cassandra #database #python

spatie/once

A magic memoization function in PHP that helps improve performance by caching function results.

1.4K
Active
PHP
API Frameworks
CLI Tools
PHP
#cache #memoization #performance

akka/alpakka-kafka

Alpakka Kafka connector - a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka.

1.4K
Stable
Scala
API Frameworks
Databases
Akka
#kafka #reactive-streams #akka-streams

business-science/free_r_tips

A free newsletter with bite-sized R-tips and code tutorials for data scientists and developers.

1.4K
Archived
HTML
Tutorials & Courses
Backend Frameworks
R
#data-science #r-tips #newsletter

allenai/dolma

A Python library and tools for generating and inspecting data for pre-training large language models (LLMs).

1.4K
Stable
Python
LLM Frameworks
Data Processing
Python
#large-language-models #data-processing #natural-language-processing

mauricio/postgresql-async

Async, Netty-based database drivers for PostgreSQL and MySQL, written in Scala.

1.4K
Archived
Scala
API Frameworks
Databases
#postgresql #mysql #async