Trending Projects

Discover the fastest growing open source projects

Showing 701-750 of 897 trending projects

#701
tensorchord/pgvecto.rs

Scalable, low-latency vector search in Postgres.

+26
+1.2%
2.2K
total stars
#702
awslabs/open-data-registry

A registry of publicly available datasets hosted on AWS for data-driven developers.

+26
+1.6%
1.7K
total stars
#703
realm/realm-core

Core database component for the Realm Mobile Database SDKs, a popular NoSQL database for mobile apps.

+26
+2.5%
1.0K
total stars
#704
bashtage/linearmodels

This Python library provides additional linear models for statistical modeling and analysis.

+26
+2.6%
1.0K
total stars
#705
olric-data/olric

Olric is a distributed, in-memory key/value store and cache for Go applications and services.

+25
+0.7%
3.4K
total stars
#706
chaisql/chai

A modern embedded SQL database written in Go.

+25
+1.5%
1.7K
total stars
#707
substrait-io/substrait

A cross-platform way to express data transformation, relational algebra, and standardized record expression and plans.

+25
+1.7%
1.5K
total stars
#708
zinggAI/zingg

Scalable identity resolution, entity resolution, data mastering, and deduplication using ML.

+25
+2.2%
1.2K
total stars
#709
apache/amoro

Apache Amoro is an open-source Lakehouse management system built on open table formats such as Hudi and Iceberg.

+25
+2.3%
1.1K
total stars
#710
Alluxio/alluxio

Alluxio is an open-source data orchestration platform for analytics and machine learning workloads in the cloud.

+24
+0.3%
7.2K
total stars
#711
indradb/indradb

A Rust-based graph database for developers who need to store and query connected data.

+24
+1.0%
2.4K
total stars
#712
LastAncientOne/Stock_Analysis_For_Quant

A collection of stock analysis tools across various programming languages and platforms.

+24
+1.2%
2.0K
total stars
#713
google/tensorstore

A C++ library for reading and writing large multi-dimensional arrays, useful for scientific and data-intensive applications.

+24
+1.6%
1.5K
total stars
#714
ravendb/ravendb

A highly scalable, distributed, document-oriented NoSQL database with full-text search, spatial, and time-series support.

+23
+0.6%
3.9K
total stars
#715
konradhalas/dacite

A simple Python library for creating dataclasses from dictionaries.

+23
+1.2%
2.0K
total stars
#716
pachterlab/gget

gget is a Python library that enables efficient querying of genomic reference databases like NCBI, Ensembl, and UniProt.

+23
+2.1%
1.1K
total stars
#717
caserec/Datasets-for-Recommender-Systems

A repository of high-quality public datasets for building and evaluating recommender systems.

+23
+2.1%
1.1K
total stars
#718
lijin-THU/notes-python

A comprehensive set of Python notes and resources for developers, covering a wide range of topics including data science, machine learning, and scientific computing.

+22
+0.3%
7.1K
total stars
#719
oceanbase/seekdb

AI-native database unifying vector, text, and structured data for hybrid search and in-database AI workflows.

+22
+0.9%
2.4K
total stars
#720
babyfish-ct/jimmer

An advanced ORM library for Java and Kotlin developers that provides powerful caching and data management features.

+22
+1.4%
1.6K
total stars
#721
mono/taglib-sharp

A C# library for reading and writing metadata in media files, useful for audio and video processing applications.

+22
+1.6%
1.4K
total stars
#722
lmmentel/awesome-python-chemistry

A curated list of Python packages for chemistry, including computational chemistry, molecular dynamics, and quantum chemistry.

+22
+1.6%
1.4K
total stars
#723
orientechnologies/orientdb

OrientDB is a versatile, multi-model DBMS that supports Graph, Document, Reactive, Full-Text, and Geospatial models.

+21
+0.4%
4.9K
total stars
#724
fluid-cloudnative/fluid

Fluid is a distributed data abstraction and acceleration framework for Big Data and AI applications on the cloud.

+21
+1.1%
1.9K
total stars
#725
reata/sqllineage

A Python-based SQL lineage analysis tool for data discovery and governance.

+21
+1.3%
1.6K
total stars
#726
aws-samples/aws-glue-samples

AWS Glue code samples for building data integration and ETL pipelines on AWS.

+21
+1.4%
1.5K
total stars
#727
apache/cloudberry

Open-source massively parallel processing (MPP) database, an alternative to Greenplum.

+21
+1.8%
1.2K
total stars
#728
DaveSkender/Stock.Indicators

A C# NuGet package that provides technical indicators and trading insights for financial market data analysis.

+21
+1.8%
1.2K
total stars
#729
crate/crate

CrateDB is a distributed, scalable SQL database for storing and analyzing massive amounts of data in near real-time.

+20
+0.5%
4.4K
total stars
#730
google/youtube-8m

Starter code for working with the YouTube-8M dataset, a large-scale video understanding dataset.

+20
+0.8%
2.4K
total stars
#731
h5py/h5py

A Python library for accessing the HDF5 binary data format, a popular format for scientific and numerical data.

+20
+0.9%
2.2K
total stars
#732
sfirke/janitor

A collection of simple tools for data cleaning and wrangling in R for data science tasks.

+20
+1.4%
1.4K
total stars
#733
wx-chevalier/Database-Notes

A comprehensive collection of notes and resources for understanding different database technologies and concepts.

+20
+1.5%
1.4K
total stars
#734
PDAL/PDAL

PDAL is a C++ library for processing point cloud data, similar to GDAL for raster data.

+20
+1.5%
1.3K
total stars
#735
microsoft/Trill

Trill is a single-node query processor for temporal or streaming data.

+20
+1.6%
1.3K
total stars
#736
dotnetcore/FreeSql

An ORM (Object-Relational Mapping) library for .NET that supports a wide range of database providers, including SQL Server, MySQL, PostgreSQL, and more.

+19
+0.4%
4.4K
total stars
#737
paulvangentcom/heartrate_analysis_python

A Python package for analyzing heart rate data from PPG and ECG signals.

+19
+1.8%
1.1K
total stars
#738
modin-project/modin

Modin: Scalable Pandas workflows with a single line of code change, enabling distributed data processing.

+18
+0.2%
10.4K
total stars
#739
niderhoff/nlp-datasets

A curated list of free/public domain text datasets for natural language processing (NLP) tasks.

+18
+0.3%
6.0K
total stars
#740
isar/isar

Extremely fast, easy-to-use, and fully async NoSQL database for Flutter apps.

+18
+0.5%
4.0K
total stars
#741
mdeff/fma

A dataset for music analysis and research, with support for deep learning and reproducible research.

+18
+0.7%
2.6K
total stars
#742
huandu/go-sqlbuilder

A flexible and powerful SQL string builder library plus a zero-config ORM for Go developers.

+18
+1.1%
1.7K
total stars
#743
bashtage/arch

A comprehensive Python library for modeling and forecasting financial time series data using ARCH models.

+18
+1.2%
1.5K
total stars
#744
crazyhottommy/getting-started-with-genomics-tools-and-resources

A collection of Unix, R, and Python tools for bioinformatics and data science projects.

+18
+1.3%
1.4K
total stars
#745
marcboeker/gmail-to-sqlite

Index your Gmail account to a SQLite DB and perform custom data analysis on your email.

+18
+1.5%
1.2K
total stars
#746
kelvins/municipios-brasileiros

A dataset of Brazilian municipalities, including IBGE codes, latitude, longitude, and more.

+18
+1.5%
1.2K
total stars
#747
tidwall/btree

A high-performance B-tree implementation for Go, useful for building database-like applications.

+18
+1.5%
1.2K
total stars
#748
patx/pickledb

An in-memory key-value store for Python that uses the third-party orjson library for persistence, with SQLite support.

+18
+1.7%
1.1K
total stars
#749
IndrajeetPatil/ggstatsplot

ggstatsplot is an R library that enhances ggplot2 visualizations with statistical analysis and hypothesis testing.

+17
+0.8%
2.2K
total stars
#750
vaastav/Fantasy-Premier-League

A Python script that generates a CSV file with data about players in Fantasy Premier League, the official fantasy game of the English Premier League.

+17
+1.0%
1.7K
total stars