Trending Projects

Discover the fastest growing open source projects

Showing 401-450 of 897 trending projects

#401

indradb/indradb

A Rust-based graph database for developers who need to store and query connected data.

0.0%

2.4K

total stars

Rust

#402

jayinai/data-science-question-answer

A collection of data science related questions and answers for developers.

0.0%

2.4K

total stars

Jupyter Notebook

#403

apache/hamilton

Hamilton is an open-source ETL framework that helps data scientists and engineers build modular, testable dataflows with lineage and metadata.

0.0%

2.4K

total stars

Jupyter Notebook

#404

malloydata/malloy

Malloy is an open-source language for describing data relationships and transformations.

0.0%

2.4K

total stars

TypeScript

#405

youngyangyang04/Skiplist-CPP

A lightweight key-value store built with C++ using a skiplist data structure.

0.0%

2.4K

total stars

C++

#406

pydata/numexpr

A fast numerical array expression evaluator for Python, NumPy, Pandas, PyTables and more.

0.0%

2.4K

total stars

Python

#407

meltano/meltano

Meltano is a declarative, code-first data integration engine for building and scaling data and ML-powered products.

0.0%

2.4K

total stars

Python

#408

google/youtube-8m

Starter code for working with the YouTube-8M dataset, a large-scale video understanding dataset.

0.0%

2.4K

total stars

Python

#409

quarylabs/quary

Open-source BI platform for engineers to explore and model large-scale data pipelines.

0.0%

2.4K

total stars

Rust

#410

emirozer/fake2db

A Python library that generates fake data for custom test databases.

0.0%

2.4K

total stars

Python

#411

PyWavelets/pywt

PyWavelets is a Python library for wavelet transform algorithms and techniques, useful for image and signal processing.

0.0%

2.3K

total stars

Python

#412

orbitjs/orbit

A composable data framework for building ambitious web applications using TypeScript.

0.0%

2.3K

total stars

TypeScript

#413

VictoriaMetrics/fastcache

Fast in-memory cache library for Go with low GC overhead, optimized for a large number of entries.

0.0%

2.3K

total stars

#414

JasonKessler/scattertext

A Python library for creating beautiful visualizations of language differences across document types.

0.0%

2.3K

total stars

Python

#415

chezou/tabula-py

A simple Python wrapper for the Tabula Java library, which extracts tables from PDF files into Pandas DataFrames.

0.0%

2.3K

total stars

Python

#416

apache/parquet-format

Apache Parquet Format, a columnar data storage format used in the Apache Hadoop ecosystem.

0.0%

2.3K

total stars

Thrift

#417

binance/binance-public-data

A Python library to access historical market data from the Binance cryptocurrency exchange.

0.0%

2.3K

total stars

Python

#418

h5py/h5py

A Python library for accessing the HDF5 binary data format, a popular format for scientific and numerical data.

0.0%

2.2K

total stars

Python

#419

man-group/ArcticDB

ArcticDB is a high-performance, serverless DataFrame database for the Python data science ecosystem.

0.0%

2.2K

total stars

C++

#420

supabase/etl

A real-time Postgres data replication and streaming library built in Rust for building CDC pipelines.

0.0%

2.2K

total stars

Rust

#421

BlankerL/DXY-COVID-19-Data

A data warehouse for COVID-19 time series data, useful for data analysis and visualization.

0.0%

2.2K

total stars

Python

#422

IndrajeetPatil/ggstatsplot

ggstatsplot is an R library that enhances ggplot2 visualizations with statistical analysis and hypothesis testing.

0.0%

2.2K

total stars

#423

tensorchord/pgvecto.rs

Scalable, low-latency vector search extension for Postgres.

0.0%

2.2K

total stars

Rust

#424

timeplus-io/proton

Fast, single-binary C++ SQL ETL pipeline for stream processing, observability, analytics, and AI/ML.

0.0%

2.2K

total stars

C++

#425

ngaut/builddatabase

A distributed SQL database built from scratch; it does not target vibe coders or AI tools.

0.0%

2.1K

total stars

#426

RJT1990/pyflux

Open source time series library for Python, useful for statistical analysis and modeling.

0.0%

2.1K

total stars

Python

#427

fugue-project/fugue

A unified interface for distributed computing on Spark, Dask and Ray without any rewrites.

0.0%

2.1K

total stars

Python

#428

Jon-Becker/prediction-market-analysis

Framework for collecting and analyzing prediction market data with comprehensive Polymarket/Kalshi datasets.

0.0%

2.1K

total stars

Python

#429

konradhalas/dacite

A simple Python library for creating dataclasses from dictionaries.

0.0%

2.0K

total stars

Python

#430

chris1610/pbpython

A collection of Python code, notebooks, and examples for practical business data analysis and visualization.

0.0%

2.0K

total stars

Jupyter Notebook

#431

moj-analytical-services/splink

Fast, accurate, and scalable probabilistic data linkage with support for multiple SQL backends.

0.0%

2.0K

total stars

Python

#432

soedinglab/MMseqs2

MMseqs2 is an ultra-fast and sensitive bioinformatics tool for sequence search and clustering.

0.0%

2.0K

total stars

#433

apache/bookkeeper

Apache BookKeeper is a scalable, fault-tolerant, low-latency storage service optimized for append-only workloads.

0.0%

2.0K

total stars

Java

#434

zhu-xlab/GlobalBuildingAtlas

GlobalBuildingAtlas is an open global and complete dataset of building polygons, heights and LoD1 3D models.

0.0%

2.0K

total stars

Python

#435

apache/datafusion-ballista

Apache DataFusion Ballista is a distributed query engine for big data analysis, built with Rust and Arrow.

0.0%

2.0K

total stars

Rust

#436

alibaba/clusterdata

Trace data collected from Alibaba's production clusters for cluster management research.

0.0%

2.0K

total stars

Jupyter Notebook

#437

mysql2sqlite/mysql2sqlite

Converts MySQL database dumps to SQLite3 compatible formats for easier migration and data portability.

0.0%

2.0K

total stars

Awk

#438

LastAncientOne/Stock_Analysis_For_Quant

A collection of stock analysis tools across various programming languages and platforms.

0.0%

2.0K

total stars

Jupyter Notebook

#439

shancarter/mr-data-converter

A JavaScript library that converts CSV and tab-delimited data to web-friendly formats like JSON and XML.

0.0%

2.0K

total stars

JavaScript

#440

bytewax/bytewax

Bytewax is a Python library for building scalable, fault-tolerant, and low-latency data processing pipelines.

0.0%

2.0K

total stars

Python

#441

igraph/igraph

A powerful C library for analyzing complex networks and graph-based data structures.

0.0%

1.9K

total stars

#442

JuliaPlots/Plots.jl

Powerful plotting and data visualization library for the Julia programming language.

0.0%

1.9K

total stars

Julia

#443

zarr-developers/zarr-python

A Python library for efficient, compressed N-dimensional arrays, useful for data scientists and ML engineers.

0.0%

1.9K

total stars

Python

#444

openacid/slim

A space-efficient trie data structure in Go with fast lookup performance.

0.0%

1.9K

total stars

#445

eveningkid/denodb

A versatile ORM for Deno that supports multiple databases, including MySQL, SQLite, MariaDB, PostgreSQL, and MongoDB.

0.0%

1.9K

total stars

TypeScript

#446

duckdb/duckdb-wasm

WebAssembly version of the DuckDB analytical database, enabling fast in-browser analytics and SQL queries.

0.0%

1.9K

total stars

C++

#447

brimdata/zui

Zui is a powerful desktop app for exploring and working with data, with support for CSV, JSON, and the Zed data format.

0.0%

1.9K

total stars

TypeScript

#448

mirage/irmin

Irmin is a distributed database that follows the same design principles as Git, allowing for distributed version control of data.

0.0%

1.9K

total stars

OCaml

#449

enhancedformysql/The-Art-of-Problem-Solving-in-Software-Engineering_How-to-Make-MySQL-Better

This repository provides a comprehensive guide on optimizing MySQL performance and solving common database problems.

0.0%

1.9K

total stars

#450

fjall-rs/fjall

A high-performance, embeddable key-value storage engine written in Rust for developers building data-intensive applications.

0.0%

1.9K

total stars

Rust

