Trending Projects

Discover the fastest-growing open source projects

Showing 351-400 of 897 trending projects

#351
TobikoData/sqlmesh

Scalable and efficient data transformation framework with backwards compatibility for dbt.

0
0.0%
2.9K
total stars
#352
hosseinmoein/DataFrame

C++ DataFrame library for statistical, financial, and machine learning analysis.

0
0.0%
2.9K
total stars
#353
huggingface/datatrove

A Python library that provides a set of customizable pipeline blocks for data processing tasks.

0
0.0%
2.9K
total stars
#354
arpanghosh8453/garmin-grafana

A Python script to fetch Garmin health data and populate it in an InfluxDB database for visualization in Grafana.

0
0.0%
2.9K
total stars
#355
timescale/pgvectorscale

A Postgres extension for high-performance vector search, complementing pgvector for scale.

0
0.0%
2.9K
total stars
#356
kayak/pypika

PyPika is a Python SQL query builder that provides a readable, Pythonic syntax for constructing complex SQL queries.

0
0.0%
2.9K
total stars
#357
apache/gravitino

An open-source data catalog platform for building a high-performance, federated metadata lake.

0
0.0%
2.9K
total stars
#358
ekzhu/datasketch

A Python library for data sketching techniques like MinHash, LSH, HyperLogLog, and HNSW for approximate similarity search.

0
0.0%
2.9K
total stars
#359
orbitinghail/sqlsync

Collaborative offline-first SQLite wrapper for syncing app state across users and devices.

0
0.0%
2.9K
total stars
#360
vortex-data/vortex

An extensible, high-performance columnar file format for data storage and processing.

0
0.0%
2.8K
total stars
#361
wesm/feather

Feather is a fast, interoperable binary data frame storage format for Python, R, and more, powered by Apache Arrow.

0
0.0%
2.8K
total stars
#362
HouzuoGuo/tiedot

A basic document (NoSQL) database implementation in Go, suitable for small-scale projects.

0
0.0%
2.7K
total stars
#363
MakieOrg/Makie.jl

A powerful data visualization and plotting library for the Julia programming language.

0
0.0%
2.7K
total stars
#364
zemirco/json2csv

Convert JSON to CSV with column titles.

0
0.0%
2.7K
total stars
#365
aditya-grover/node2vec

This Scala library provides a high-performance implementation of the node2vec algorithm for embedding graphs.

0
0.0%
2.7K
total stars
#366
mourner/rbush

RBush is a high-performance JavaScript R-tree-based 2D spatial index for points and rectangles.

0
0.0%
2.7K
total stars
#367
chdb-io/chdb

An in-process OLAP SQL Engine powered by ClickHouse, enabling fast and efficient data analysis.

0
0.0%
2.6K
total stars
#368
dolthub/go-mysql-server

A MySQL-compatible relational database with a storage agnostic query engine, implemented in Go.

0
0.0%
2.6K
total stars
#369
posit-dev/great-tables

A Python library for creating easy-to-use, visually appealing data tables and summaries.

0
0.0%
2.6K
total stars
#370
aimeos/upscheme

A database migration and schema management tool for PHP developers, supporting multiple database engines.

0
0.0%
2.6K
total stars
#371
Visualize-ML/Book6_First-Course-in-Data-Science

A book on data science, covering topics from basic math to machine learning using Python and Jupyter Notebooks.

0
0.0%
2.6K
total stars
#372
mpquant/MyTT

A Python library of the most common stock market technical indicators, for use in quantitative finance and algorithmic trading.

0
0.0%
2.6K
total stars
#373
schematics/schematics

Python data structures library focused on serialization, deserialization, and validation of complex data schemas.

0
0.0%
2.6K
total stars
#374
facebook/mysql-5.6

This is Facebook's branch of the Oracle MySQL database, including the MyRocks storage engine.

0
0.0%
2.6K
total stars
#375
mdeff/fma

A dataset for music analysis and research, with support for deep learning and reproducible research.

0
0.0%
2.6K
total stars
#376
CamDavidsonPilon/lifelines

A Python library for survival analysis, useful for developers working with time-to-event data.

0
0.0%
2.6K
total stars
#377
veb-101/Data-Science-Projects

A collection of data science projects in Python using Jupyter Notebook.

0
0.0%
2.6K
total stars
#378
GanjinZero/awesome_Chinese_medical_NLP

A curated collection of open-source Chinese medical NLP resources including datasets, models, and more.

0
0.0%
2.5K
total stars
#379
FeatureBaseDB/featurebase

FeatureBase is a fast analytical database built on bitmaps, perfect for ML and data-intensive applications.

0
0.0%
2.5K
total stars
#380
dblalock/bolt

A fast C++ library for high-performance matrix and vector operations.

0
0.0%
2.5K
total stars
#381
colour-science/colour

A comprehensive Python library for color science and color space conversions.

0
0.0%
2.5K
total stars
#382
rilldata/rill

Rill is a tool for transforming data sets into powerful dashboards using SQL, enabling BI-as-code.

0
0.0%
2.5K
total stars
#383
sfikas/medical-imaging-datasets

A collection of medical imaging datasets for researchers and developers in the healthcare industry.

0
0.0%
2.5K
total stars
#384
duckdb/ducklake

DuckLake is an integrated data lake and catalog format written in C++.

0
0.0%
2.5K
total stars
#385
The-Japan-DataScientist-Society/100knocks-preprocess

A repository for the 100 Knocks of Data Science Preprocessing, focused on structured data processing.

0
0.0%
2.5K
total stars
#386
eddwebster/football_analytics

A collection of football analytics projects, data, and analysis by Edd Webster (@eddwebster).

0
0.0%
2.5K
total stars
#387
EntilZha/PyFunctional

A Python library for creating data processing pipelines using functional programming principles.

0
0.0%
2.5K
total stars
#388
hardikkamboj/An-Introduction-to-Statistical-Learning

This repository provides Python implementations of exercises from the book 'An Introduction to Statistical Learning'.

0
0.0%
2.5K
total stars
#389
garden-co/jazz

A distributed database with CRDT sync, offline support, and end-to-end encryption for vibe coders.

0
0.0%
2.5K
total stars
#390
griddb/griddb

GridDB is a fast and scalable open-source database for time-series IoT and big data applications.

0
0.0%
2.5K
total stars
#391
nicolaspanel/numjs

A JavaScript library that provides a NumPy-like interface for working with multi-dimensional arrays and matrices.

0
0.0%
2.5K
total stars
#392
neilotoole/sq

sq is a Go-based data wrangling tool that supports a variety of data formats and databases.

0
0.0%
2.5K
total stars
#393
lerocha/chinook-database

Sample database for SQL Server, Oracle, MySQL, PostgreSQL, SQLite, and DB2.

0
0.0%
2.5K
total stars
#394
geekyouth/SZT-bigdata

This is a big data analysis system for the Shenzhen metro with support for various data processing tools.

0
0.0%
2.4K
total stars
#395
armink/FlashDB

An ultra-lightweight database that supports key-value and time series data for embedded and IoT applications.

0
0.0%
2.4K
total stars
#396
oceanbase/seekdb

AI-native database unifying vector, text, and structured data for hybrid search and in-database AI workflows.

0
0.0%
2.4K
total stars
#397
reiinakano/scikit-plot

An intuitive Python library that adds plotting functionality to scikit-learn machine learning models.

0
0.0%
2.4K
total stars
#398
PizzaDeDados/datascience-pizza

A repository for collecting study materials and resources on data analysis and related fields.

0
0.0%
2.4K
total stars
#399
benedekrozemberczki/awesome-community-detection

A curated list of community detection research papers with implementations for data science and network analysis.

0
0.0%
2.4K
total stars
#400
lukes/ISO-3166-Countries-with-Regional-Codes

A comprehensive dataset of ISO 3166-1 country codes and their corresponding UN Geoscheme regional codes, ready to use in various formats.

0
0.0%
2.4K
total stars
