Trending Projects

Discover the fastest growing open source projects

Showing 251-300 of 897 trending projects

#251
oceanbase/seekdb

AI-native database unifying vector, text, and structured data for hybrid search and in-database AI workflows.

+22
+0.9%
2.4K
total stars
#252
zarr-developers/zarr-python

An efficient and compressed N-dimensional array library for Python, useful for data scientists and ML engineers.

+22
+1.1%
1.9K
total stars
#253
GreenmaskIO/greenmask

A Go-based tool for database anonymization and synthetic data generation to help with security, QA, and data masking.

+22
+1.4%
1.6K
total stars
#254
JoinQuant/jqdatasdk

A Python package for easy access to financial market data in China for quantitative finance and FinTech applications.

+22
+1.8%
1.2K
total stars
#255
allegro/bigcache

Efficient in-memory cache in Go for storing and retrieving large amounts of data.

+21
+0.3%
8.1K
total stars
#256
xiangyuecn/AreaCity-JsSpider-StatsGov

Comprehensive collection of city and administrative region data for China, with features like CSV export, JS code generation, and web scraping.

+21
+0.3%
6.4K
total stars
#257
ron-rs/ron

A Rust library for serializing and deserializing data in the Rusty Object Notation (RON) format.

+21
+0.6%
3.9K
total stars
#258
MakieOrg/Makie.jl

A powerful data visualization and plotting library for the Julia programming language.

+21
+0.8%
2.7K
total stars
#259
soedinglab/MMseqs2

MMseqs2 is an ultra-fast and sensitive bioinformatics tool for sequence search and clustering.

+21
+1.1%
2.0K
total stars
#260
apache/datafusion-ballista

Apache DataFusion Ballista is a distributed query engine for big data analysis, built with Rust and Arrow.

+21
+1.1%
2.0K
total stars
#261
LastAncientOne/Stock_Analysis_For_Quant

A collection of stock analysis tools across various programming languages and platforms.

+21
+1.1%
2.0K
total stars
#262
data-engineering-community/data-engineering-wiki

A community-driven wiki for learning data engineering, covering topics like data modeling, pipelines, and databases.

+21
+1.1%
1.9K
total stars
#263
LibRaw/LibRaw

LibRaw is a C++ library for reading RAW image files from digital cameras.

+21
+1.5%
1.4K
total stars
#264
projectnessie/nessie

Nessie is a transactional data catalog for data lakes that provides Git-like semantics and functionality.

+21
+1.5%
1.4K
total stars
#265
erikgrinaker/toydb

An educational distributed SQL database written in Rust.

+20
+0.3%
7.2K
total stars
#266
bukosabino/ta

Technical Analysis Library using Pandas and Numpy for financial data analysis and trading strategies.

+20
+0.4%
4.9K
total stars
#267
briatte/awesome-network-analysis

A curated list of awesome resources for network analysis and visualization, with a focus on R tools.

+20
+0.5%
4.0K
total stars
#268
gee-community/geemap

A Python package for interactive geospatial analysis and visualization with Google Earth Engine.

+20
+0.5%
3.9K
total stars
#269
holistics/dbml

A database modeling language (DBML) that helps define and document database structures.

+20
+0.6%
3.5K
total stars
#270
apache/auron

The Auron accelerator framework leverages vectorized execution to speed up distributed computing on big data platforms like Spark.

+20
+1.2%
1.7K
total stars
#271
jldbc/pybaseball

A Python library for pulling current and historical baseball statistics, including Statcast, Baseball Reference, and FanGraphs data.

+20
+1.3%
1.6K
total stars
#272
pgvector/pgvector-python

A Python library that provides support for pgvector, the PostgreSQL extension for efficient vector search and storage.

+20
+1.4%
1.4K
total stars
#273
datazip-inc/olake

Fastest open-source data pipeline tool for replicating databases to data lakes in Apache Iceberg format.

+20
+1.6%
1.3K
total stars
#274
apache/ozone

Scalable, reliable, distributed storage system optimized for data analytics and object store workloads.

+20
+1.7%
1.2K
total stars
#275
vaexio/vaex

A high-performance Python library for working with large tabular datasets, offering efficient data manipulation and visualization.

+19
+0.2%
8.5K
total stars
#276
DataLinkDC/dinky

Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.

+19
+0.5%
3.7K
total stars
#277
apache/avro

Apache Avro is a data serialization system for efficient storage and transmission of structured data.

+19
+0.6%
3.2K
total stars
#278
alibaba/clusterdata

A dataset of cluster data collected from Alibaba's production clusters for cluster management research.

+19
+1.0%
2.0K
total stars
#279
ptyadana/SQL-Data-Analysis-and-Visualization-Projects

This GitHub repository contains SQL data analysis and visualization projects using various tools and databases.

+19
+1.1%
1.7K
total stars
#280
koaning/drawdata

A Python library that allows developers to easily draw datasets within their notebooks.

+19
+1.2%
1.6K
total stars
#281
felt/tippecanoe

Build vector tilesets from large collections of GeoJSON features.

+19
+1.3%
1.4K
total stars
#282
manami-project/anime-offline-database

This repository provides a comprehensive JSON dataset containing metadata on anime series, movies, and cross-references to various anime sites.

+19
+1.6%
1.2K
total stars
#283
Azure/AzurePublicDataset

A repository of Microsoft Azure traces, with Jupyter Notebook-based material.

+19
+1.8%
1.1K
total stars
#284
stephencelis/SQLite.swift

A type-safe, Swift-language layer over SQLite3 for building database-backed Swift applications.

+18
+0.2%
10.1K
total stars
#285
apache/pinot

Apache Pinot is a realtime distributed OLAP datastore for fast querying of large datasets.

+18
+0.3%
6.0K
total stars
#286
ujjwalkarn/DataSciencePython

A collection of Python resources for common data analysis and machine learning tasks.

+18
+0.3%
5.7K
total stars
#287
hosseinmoein/DataFrame

C++ DataFrame library for statistical, financial, and machine learning analysis.

+18
+0.6%
2.9K
total stars
#288
chdb-io/chdb

An in-process OLAP SQL Engine powered by ClickHouse, enabling fast and efficient data analysis.

+18
+0.7%
2.6K
total stars
#289
colour-science/colour

A comprehensive Python library for color science and color space conversions.

+18
+0.7%
2.5K
total stars
#290
avhz/RustQuant

A Rust library for quantitative finance, including tools for machine learning, option pricing, and trading.

+18
+1.1%
1.7K
total stars
#291
datalevin/datalevin

A simple, fast and versatile Datalog database written in Clojure.

+18
+1.3%
1.4K
total stars
#292
lvgalvao/data-engineering-roadmap

Comprehensive roadmap for data engineering and AI development in Python.

+18
+1.6%
1.1K
total stars
#293
rordenlab/dcm2niix

A DICOM to NIfTI converter for medical imaging research and neuroimaging applications.

+18
+1.6%
1.1K
total stars
#294
Automattic/mongoose

Mongoose is a MongoDB object modeling tool for Node.js and Deno, simplifying database interactions with schemas and models.

+17
+0.1%
27.5K
total stars
#295
prestodb/presto

Presto is an open-source distributed SQL query engine for big data, allowing fast analysis of large datasets.

+17
+0.1%
16.7K
total stars
#296
apache/druid

Apache Druid is a high-performance real-time analytics database for data-intensive applications.

+17
+0.1%
13.9K
total stars
#297
kurrent-io/KurrentDB

KurrentDB is an event-native database designed for modern software and event-driven architectures.

+17
+0.3%
5.7K
total stars
#298
deepseek-ai/smallpond

A lightweight data processing framework built on DuckDB and 3FS.

+17
+0.3%
4.9K
total stars
#299
ydb-platform/ydb

An open-source distributed SQL database with high availability, scalability, and ACID transactions.

+17
+0.4%
4.7K
total stars
#300
RoaringBitmap/RoaringBitmap

A high-performance compressed bitset library for Java used in Apache Spark, Netflix Atlas, and others.

+17
+0.5%
3.8K
total stars