Trending Projects

Discover the fastest-growing open source projects

Showing 351-400 of 897 trending projects

#351

TobikoData/sqlmesh

Scalable and efficient data transformation framework with backwards compatibility for dbt.

0.0%

2.9K

total stars

Python

#352

hosseinmoein/DataFrame

C++ DataFrame library for statistical, financial, and machine learning analysis.

0.0%

2.9K

total stars

C++

#353

huggingface/datatrove

A Python library that provides a set of customizable pipeline processing blocks for data processing tasks.

0.0%

2.9K

total stars

Python

#354

arpanghosh8453/garmin-grafana

A Python script to fetch Garmin health data and populate it in an InfluxDB database for visualization in Grafana.

0.0%

2.9K

total stars

Python

#355

timescale/pgvectorscale

A Postgres extension for high-performance vector search, complementing pgvector for scale.

0.0%

2.9K

total stars

Rust

#356

kayak/pypika

PyPika is a Python SQL query builder that provides a readable, Pythonic syntax for constructing complex SQL queries.

0.0%

2.9K

total stars

Python

#357

apache/gravitino

An open-source data catalog platform for building a high-performance, federated metadata lake.

0.0%

2.9K

total stars

Java

#358

ekzhu/datasketch

A Python library for data sketching techniques like MinHash, LSH, HyperLogLog, and HNSW for approximate similarity search.

0.0%

2.9K

total stars

Python

#359

orbitinghail/sqlsync

Collaborative offline-first SQLite wrapper for syncing app state across users and devices.

0.0%

2.9K

total stars

Rust

#360

vortex-data/vortex

An extensible, high-performance columnar file format for data storage and processing.

0.0%

2.8K

total stars

Rust

#361

wesm/feather

Feather is a fast, interoperable binary data frame storage for Python, R, and more powered by Apache Arrow.

0.0%

2.8K

total stars

JavaScript

#362

HouzuoGuo/tiedot

A basic document (NoSQL) database implementation in Go, suitable for small-scale projects.

0.0%

2.7K

total stars

#363

MakieOrg/Makie.jl

A powerful data visualization and plotting library for the Julia programming language.

0.0%

2.7K

total stars

Julia

#364

zemirco/json2csv

Convert JSON to CSV with column titles.

0.0%

2.7K

total stars

JavaScript

#365

aditya-grover/node2vec

This Scala library provides a high-performance implementation of the node2vec algorithm for embedding graphs.

0.0%

2.7K

total stars

Scala

#366

mourner/rbush

RBush is a high-performance JavaScript R-tree-based 2D spatial index for points and rectangles.

0.0%

2.7K

total stars

JavaScript

#367

chdb-io/chdb

An in-process OLAP SQL Engine powered by ClickHouse, enabling fast and efficient data analysis.

0.0%

2.6K

total stars

C++

#368

dolthub/go-mysql-server

A MySQL-compatible relational database with a storage agnostic query engine, implemented in Go.

0.0%

2.6K

total stars

#369

posit-dev/great-tables

A Python library for creating easy-to-use, visually appealing data tables and summaries.

0.0%

2.6K

total stars

Python

#370

aimeos/upscheme

A database migration and schema management tool for PHP developers, supporting multiple database engines.

0.0%

2.6K

total stars

PHP

#371

Visualize-ML/Book6_First-Course-in-Data-Science

A book on data science, covering topics from basic math to machine learning using Python and Jupyter Notebooks.

0.0%

2.6K

total stars

Jupyter Notebook

#372

mpquant/MyTT

A Python library implementing the most common stock market technical indicators, for use in quantitative finance and algorithmic trading.

0.0%

2.6K

total stars

Python

#373

schematics/schematics

Python data structures library focused on serialization, deserialization, and validation of complex data schemas.

0.0%

2.6K

total stars

Python

#374

facebook/mysql-5.6

This is Facebook's branch of the Oracle MySQL database, including the MyRocks storage engine.

0.0%

2.6K

total stars

C++

#375

mdeff/fma

A dataset for music analysis and research, with support for deep learning and reproducible research.

0.0%

2.6K

total stars

Jupyter Notebook

#376

CamDavidsonPilon/lifelines

A Python library for survival analysis, useful for developers working with time-to-event data.

0.0%

2.6K

total stars

Python

#377

veb-101/Data-Science-Projects

A collection of data science projects in Python using Jupyter Notebook.

0.0%

2.6K

total stars

Jupyter Notebook

#378

GanjinZero/awesome_Chinese_medical_NLP

A curated collection of open-source Chinese medical NLP resources including datasets, models, and more.

0.0%

2.5K

total stars

#379

FeatureBaseDB/featurebase

FeatureBase is a fast analytical database built on bitmaps, perfect for ML and data-intensive applications.

0.0%

2.5K

total stars

#380

dblalock/bolt

A fast C++ library for high-performance matrix and vector operations.

0.0%

2.5K

total stars

C++

#381

colour-science/colour

A comprehensive Python library for color science and color space conversions.

0.0%

2.5K

total stars

Python

#382

rilldata/rill

Rill is a tool for transforming data sets into powerful dashboards using SQL, enabling BI-as-code.

0.0%

2.5K

total stars

#383

sfikas/medical-imaging-datasets

A collection of medical imaging datasets for researchers and developers in the healthcare industry.

0.0%

2.5K

total stars

#384

duckdb/ducklake

DuckLake is an integrated data lake and catalog format written in C++.

0.0%

2.5K

total stars

C++

#385

The-Japan-DataScientist-Society/100knocks-preprocess

A repository for the 100 Knocks of Data Science Preprocessing, focused on structured data processing.

0.0%

2.5K

total stars

HTML

#386

eddwebster/football_analytics

A collection of football analytics projects, data, and analysis by Edd Webster (@eddwebster).

0.0%

2.5K

total stars

Jupyter Notebook

#387

EntilZha/PyFunctional

A Python library for creating data processing pipelines using functional programming principles.

0.0%

2.5K

total stars

Python

#388

hardikkamboj/An-Introduction-to-Statistical-Learning

This repository provides Python implementations of exercises from the book 'An Introduction to Statistical Learning'.

0.0%

2.5K

total stars

Jupyter Notebook

#389

garden-co/jazz

A distributed database with CRDT sync, offline support, and end-to-end encryption for vibe coders.

0.0%

2.5K

total stars

TypeScript

#390

griddb/griddb

GridDB is a fast and scalable open-source database for time-series IoT and big data applications.

0.0%

2.5K

total stars

C++

#391

nicolaspanel/numjs

A JavaScript library that provides a NumPy-like interface for working with multi-dimensional arrays and matrices.

0.0%

2.5K

total stars

JavaScript

#392

neilotoole/sq

sq is a Go-based data wrangling tool that supports a variety of data formats and databases.

0.0%

2.5K

total stars

#393

lerocha/chinook-database

Sample database for SQL Server, Oracle, MySQL, PostgreSQL, SQLite, and DB2.

0.0%

2.5K

total stars

TSQL

#394

geekyouth/SZT-bigdata

This is a big data analysis system for the Shenzhen metro with support for various data processing tools.

0.0%

2.4K

total stars

Scala

#395

armink/FlashDB

An ultra-lightweight database that supports key-value and time series data for embedded and IoT applications.

0.0%

2.4K

total stars

#396

oceanbase/seekdb

AI-native database unifying vector, text, and structured data for hybrid search and in-database AI workflows.

0.0%

2.4K

total stars

C++

#397

reiinakano/scikit-plot

An intuitive Python library that adds plotting functionality to scikit-learn machine learning models.

0.0%

2.4K

total stars

Python

#398

PizzaDeDados/datascience-pizza

A repository for collecting study materials and resources on data analysis and related fields.

0.0%

2.4K

total stars

#399

benedekrozemberczki/awesome-community-detection

A curated list of community detection research papers with implementations for data science and network analysis.

0.0%

2.4K

total stars

Python

#400

lukes/ISO-3166-Countries-with-Regional-Codes

A comprehensive dataset of ISO 3166-1 country codes and their corresponding UN Geoscheme regional codes, ready to use in various formats.

0.0%

2.4K

total stars

Ruby
