Trending Projects

Discover the fastest growing open source projects

Showing 251-300 of 897 trending projects

#251

oceanbase/seekdb

AI-native database unifying vector, text, and structured data for hybrid search and in-database AI workflows.

+22

+0.9%

2.4K

total stars

C++

#252

zarr-developers/zarr-python

An efficient and compressed N-dimensional array library for Python, useful for data scientists and ML engineers.

+22

+1.1%

1.9K

total stars

Python

#253

GreenmaskIO/greenmask

A Go-based tool for database anonymization and synthetic data generation to help with security, QA, and data masking.

+22

+1.4%

1.6K

total stars

#254

JoinQuant/jqdatasdk

A Python package for easy access to financial market data in China for quantitative finance and FinTech applications.

+22

+1.8%

1.2K

total stars

Python

#255

allegro/bigcache

Efficient in-memory cache in Go for storing and retrieving large amounts of data.

+21

+0.3%

8.1K

total stars

#256

xiangyuecn/AreaCity-JsSpider-StatsGov

Comprehensive collection of city and administrative region data for China, with features like CSV export, JS code generation, and web scraping.

+21

+0.3%

6.4K

total stars

JavaScript

#257

ron-rs/ron

A Rust library for serializing and deserializing data in the Rusty Object Notation (RON) format.

+21

+0.6%

3.9K

total stars

Rust

#258

MakieOrg/Makie.jl

A powerful data visualization and plotting library for the Julia programming language.

+21

+0.8%

2.7K

total stars

Julia

#259

soedinglab/MMseqs2

MMseqs2 is an ultra-fast and sensitive bioinformatics tool for sequence search and clustering.

+21

+1.1%

2.0K

total stars

#260

apache/datafusion-ballista

Apache DataFusion Ballista is a distributed query engine for big data analysis, built with Rust and Arrow.

+21

+1.1%

2.0K

total stars

Rust

#261

LastAncientOne/Stock_Analysis_For_Quant

A collection of stock analysis tools across various programming languages and platforms.

+21

+1.1%

2.0K

total stars

Jupyter Notebook

#262

data-engineering-community/data-engineering-wiki

A community-driven wiki for learning data engineering, covering topics like data modeling, pipelines, and databases.

+21

+1.1%

1.9K

total stars

CSS

#263

LibRaw/LibRaw

LibRaw is a C++ library for reading RAW image files from digital cameras.

+21

+1.5%

1.4K

total stars

C++

#264

projectnessie/nessie

Nessie is a transactional data catalog for data lakes that provides Git-like semantics and functionality.

+21

+1.5%

1.4K

total stars

Java

#265

erikgrinaker/toydb

An educational distributed SQL database written in Rust.

+20

+0.3%

7.2K

total stars

Rust

#266

bukosabino/ta

Technical Analysis Library using Pandas and Numpy for financial data analysis and trading strategies.

+20

+0.4%

4.9K

total stars

Jupyter Notebook

#267

briatte/awesome-network-analysis

A curated list of awesome resources for network analysis and visualization, with a focus on R tools.

+20

+0.5%

4.0K

total stars

#268

gee-community/geemap

A Python package for interactive geospatial analysis and visualization with Google Earth Engine.

+20

+0.5%

3.9K

total stars

Python

#269

holistics/dbml

A database modeling language (DBML) that helps define and document database structures.

+20

+0.6%

3.5K

total stars

JavaScript

#270

apache/auron

The Auron accelerator framework leverages vectorized execution to speed up distributed computing on big data platforms like Spark.

+20

+1.2%

1.7K

total stars

Rust

#271

jldbc/pybaseball

A Python library for pulling current and historical baseball statistics, including Statcast, Baseball Reference, and FanGraphs data.

+20

+1.3%

1.6K

total stars

Python

#272

pgvector/pgvector-python

A Python library that provides support for pgvector, the PostgreSQL extension for vector search, enabling efficient vector search and storage.

+20

+1.4%

1.4K

total stars

Python

#273

datazip-inc/olake

Fastest open-source data pipeline tool for replicating databases to data lakes in Apache Iceberg format.

+20

+1.6%

1.3K

total stars

#274

apache/ozone

Scalable, reliable, distributed storage system optimized for data analytics and object store workloads.

+20

+1.7%

1.2K

total stars

Java

#275

vaexio/vaex

A high-performance Python library for working with large tabular datasets, offering efficient data manipulation and visualization.

+19

+0.2%

8.5K

total stars

Python

#276

DataLinkDC/dinky

Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.

+19

+0.5%

3.7K

total stars

Java

#277

apache/avro

Apache Avro is a data serialization system for efficient storage and transmission of structured data.

+19

+0.6%

3.2K

total stars

Java

#278

alibaba/clusterdata

A dataset of cluster data collected from Alibaba's production clusters for cluster management research.

+19

+1.0%

2.0K

total stars

Jupyter Notebook

#279

ptyadana/SQL-Data-Analysis-and-Visualization-Projects

This GitHub repository contains SQL data analysis and visualization projects using various tools and databases.

+19

+1.1%

1.7K

total stars

Jupyter Notebook

#280

koaning/drawdata

A Python library that allows developers to easily draw datasets within their notebooks.

+19

+1.2%

1.6K

total stars

JavaScript

#281

felt/tippecanoe

Build vector tilesets from large collections of GeoJSON features.

+19

+1.3%

1.4K

total stars

C++

#282

manami-project/anime-offline-database

This repository provides a comprehensive JSON dataset containing metadata on anime series, movies, and cross-references to various anime sites.

+19

+1.6%

1.2K

total stars

Makefile

#283

Azure/AzurePublicDataset

A repository of Microsoft Azure traces, with accompanying Jupyter notebooks.

+19

+1.8%

1.1K

total stars

Jupyter Notebook

#284

stephencelis/SQLite.swift

A type-safe, Swift-language layer over SQLite3 for building database-backed Swift applications.

+18

+0.2%

10.1K

total stars

Swift

#285

apache/pinot

Apache Pinot is a realtime distributed OLAP datastore for fast querying of large datasets.

+18

+0.3%

6.0K

total stars

Java

#286

ujjwalkarn/DataSciencePython

A collection of resources for common data analysis and machine learning tasks in Python.

+18

+0.3%

5.7K

total stars

Python

#287

hosseinmoein/DataFrame

C++ DataFrame library for statistical, financial, and machine learning analysis.

+18

+0.6%

2.9K

total stars

C++

#288

chdb-io/chdb

An in-process OLAP SQL Engine powered by ClickHouse, enabling fast and efficient data analysis.

+18

+0.7%

2.6K

total stars

C++

#289

colour-science/colour

A comprehensive Python library for color science and color space conversions.

+18

+0.7%

2.5K

total stars

Python

#290

avhz/RustQuant

A Rust library for quantitative finance, including tools for machine learning, option pricing, and trading.

+18

+1.1%

1.7K

total stars

Rust

#291

datalevin/datalevin

A simple, fast and versatile Datalog database written in Clojure.

+18

+1.3%

1.4K

total stars

Clojure

#292

lvgalvao/data-engineering-roadmap

Comprehensive roadmap for data engineering and AI development in Python.

+18

+1.6%

1.1K

total stars

Python

#293

rordenlab/dcm2niix

A DICOM to NIfTI converter for medical imaging research and neuroimaging applications.

+18

+1.6%

1.1K

total stars

C++

#294

Automattic/mongoose

Mongoose is a MongoDB object modeling tool for Node.js and Deno, simplifying database interactions with schemas and models.

+17

+0.1%

27.5K

total stars

JavaScript

#295

prestodb/presto

Presto is an open-source distributed SQL query engine for big data, allowing fast analysis of large datasets.

+17

+0.1%

16.7K

total stars

Java

#296

apache/druid

Apache Druid is a high-performance real-time analytics database for data-intensive applications.

+17

+0.1%

13.9K

total stars

Java

#297

kurrent-io/KurrentDB

KurrentDB is an event-native database designed for modern software and event-driven architectures.

+17

+0.3%

5.7K

total stars

#298

deepseek-ai/smallpond

A lightweight data processing framework built on DuckDB and 3FS.

+17

+0.3%

4.9K

total stars

Python

#299

ydb-platform/ydb

An open-source distributed SQL database with high availability, scalability, and ACID transactions.

+17

+0.4%

4.7K

total stars

C++

#300

RoaringBitmap/RoaringBitmap

A high-performance compressed bitset library for Java used in Apache Spark, Netflix Atlas, and others.

+17

+0.5%

3.8K

total stars

Java
