Trending Projects

Discover the fastest growing open source projects

Showing 601-650 of 897 trending projects

#601

apache/amoro

Apache Amoro is an open-source Lakehouse management system built on open data lake formats such as Hudi and Iceberg, and it works with engines like Flink.

+42

+3.9%

1.1K

total stars

Java

#602

apache/phoenix

Apache Phoenix is a scalable, distributed SQL engine that connects to HBase for low-latency queries.

+42

+4.2%

1.1K

total stars

Java

#603

dotnetcore/FreeSql

An ORM (Object-Relational Mapping) library for .NET that supports a wide range of database providers, including SQL Server, MySQL, PostgreSQL, and more.

+41

+0.9%

4.4K

total stars

#604

PyWavelets/pywt

PyWavelets is a Python library for wavelet transform algorithms and techniques, useful for image and signal processing.

+41

+1.8%

2.3K

total stars

Python

#605

TuGraph-family/tugraph-db

TuGraph-DB is a high-performance graph database built for fast and efficient graph data processing.

+41

+2.5%

1.7K

total stars

C++

#606

reata/sqllineage

A SQL lineage analysis tool, written in Python, that supports data discovery and governance.

+41

+2.6%

1.6K

total stars

Python

#607

percona/percona-xtrabackup

Open source hot backup tool for InnoDB and XtraDB databases

+41

+2.8%

1.5K

total stars

C++

#608

cloudera/hue

Open source SQL query assistant service for databases and data warehouses

+41

+2.9%

1.4K

total stars

JavaScript

#609

datavane/tis

A Java-based framework for building agile DataOps pipelines using tools like Flink, DataX, and Chunjun with a web UI.

+41

+3.3%

1.3K

total stars

Java

#610

YuLab-SMU/clusterProfiler

A comprehensive enrichment analysis tool for interpreting omics data, with support for GO, KEGG, and more.

+41

+3.6%

1.2K

total stars

#611

zinggAI/zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

+41

+3.7%

1.2K

total stars

Java

#612

PDAL/PDAL

PDAL is a C++ library for processing point cloud data, similar to GDAL for raster data.

+40

+3.1%

1.3K

total stars

C++

#613

wesm/msgvault

Archive, search, and analyze your entire email/chat history offline with DuckDB-powered analytics and AI queries.

+40

+3.2%

1.3K

total stars

#614

cvxgrp/cvxportfolio

A Python library for portfolio optimization and back-testing in finance.

+40

+3.5%

1.2K

total stars

Python

#615

fluentmigrator/fluentmigrator

Fluent Migrator is a .NET migration framework for managing database schema changes across multiple database providers.

+39

+1.1%

3.5K

total stars

#616

tensorchord/pgvecto.rs

A Postgres extension for scalable, low-latency vector search.

+39

+1.8%

2.2K

total stars

Rust

#617

Werneror/Poetry

A dataset of over 850,000 Chinese poems spanning ancient to modern times.

+39

+2.3%

1.7K

total stars

Python

#618

Awesome-Image-Registration-Organization/awesome-image-registration

A curated collection of resources related to image registration, including books, papers, videos, and toolboxes.

+39

+2.7%

1.5K

total stars

#619

patx/pickledb

An in-memory Python key-value store that uses the third-party orjson library for persistence, with SQLite support.

+39

+3.8%

1.1K

total stars

Python

#620

alibaba/druid

Druid is a high-performance database connection pool for Java applications, designed for monitoring and management.

+38

+0.1%

28.2K

total stars

Java

#621

cayleygraph/cayley

An open-source graph database written in Go, useful for building applications that require linked data and graph-based queries.

+38

+0.3%

15.0K

total stars

#622

databricks/koalas

Koalas is a pandas-like API for Apache Spark, enabling data scientists to work with big data using familiar pandas syntax.

+38

+1.1%

3.4K

total stars

Python

#623

tirthajyoti/Data-science-best-resources

A curated collection of resources for data science and machine learning enthusiasts.

+38

+1.2%

3.2K

total stars

#624

mourner/rbush

RBush is a high-performance JavaScript R-tree-based 2D spatial index for points and rectangles.

+38

+1.4%

2.7K

total stars

JavaScript

#625

broadinstitute/gatk

Official code repository for the Genome Analysis Toolkit (GATK), a bioinformatics library for working with next-generation DNA sequencing data.

+38

+2.0%

1.9K

total stars

Java

#626

fonnesbeck/statistical-analysis-python-tutorial

A tutorial for performing statistical data analysis using Python, covering topics like regression, hypothesis testing, and more.

+38

+2.3%

1.7K

total stars

HTML

#627

pysal/pysal

PySAL is a Python Spatial Analysis Library meta-package for geographical data analysis and modeling.

+38

+2.6%

1.5K

total stars

Python

#628

PyO3/rust-numpy

Rust-based bindings for the NumPy C-API, enabling developers to leverage Rust for numerical computing.

+38

+2.9%

1.3K

total stars

Rust

#629

x2bool/xlite

A Rust library that enables querying Excel spreadsheets using SQLite, making data extraction and analysis more efficient.

+38

+3.0%

1.3K

total stars

Rust

#630

LongOnly/Quantitative-Notebooks

Educational notebooks on quantitative finance, algorithmic trading, financial modeling, and investment strategy.

+38

+3.0%

1.3K

total stars

Jupyter Notebook

#631

paulvangentcom/heartrate_analysis_python

A Python package for analyzing heart rate data from PPG and ECG signals.

+38

+3.6%

1.1K

total stars

Python

#632

konradhalas/dacite

A simple Python library for creating dataclasses from dictionaries.

+37

+1.9%

2.0K

total stars

Python

#633

DaveSkender/Stock.Indicators

A C# NuGet package that provides technical indicators and trading insights for financial market data analysis.

+37

+3.2%

1.2K

total stars

#634

realm/realm-core

Core database component for the Realm Mobile Database SDKs, a popular NoSQL database for mobile apps.

+37

+3.7%

1.0K

total stars

C++

#635

apache/zeppelin

Zeppelin is a web-based notebook that enables data-driven, interactive data analytics and collaborative documents.

+36

+0.6%

6.6K

total stars

Java

#636

psycopg/psycopg2

A Python database adapter for PostgreSQL, allowing developers to interact with their databases.

+36

+1.0%

3.6K

total stars

#637

crate/crate

CrateDB is a distributed, scalable SQL database for storing and analyzing massive amounts of data in near real-time.

+35

+0.8%

4.4K

total stars

Java

#638

risinglightdb/risinglight

An educational OLAP database system built in Rust for learning and experimentation.

+35

+2.0%

1.8K

total stars

Rust

#639

chaisql/chai

A modern, embedded SQL database written in Go for embedded and mobile applications.

+35

+2.1%

1.7K

total stars

#640

meta-pytorch/data

A PyTorch library for data loading and utility functions shared across PyTorch domain libraries.

+35

+2.9%

1.2K

total stars

Python

#641

XD-DENG/SQL-exercise

A collection of SQL practice problems for developers to improve their SQL skills.

+34

+2.4%

1.5K

total stars

#642

toluaina/pgsync

A Python library that syncs data from Postgres to Elasticsearch/OpenSearch, enabling real-time data pipelines.

+34

+2.5%

1.4K

total stars

Python

#643

WeBankFinTech/DataSphereStudio

DataSphereStudio is a one-stop data application development and management portal covering data exchange, analysis, and visualization.

+33

+1.0%

3.3K

total stars

Java

#644

geekyouth/SZT-bigdata

A big data analysis system for Shenzhen metro data, with support for various data processing tools.

+33

+1.4%

2.4K

total stars

Scala

#645

uwdata/arquero

A JavaScript library for efficient querying and transformation of array-backed data tables.

+33

+2.3%

1.5K

total stars

JavaScript

#646

NiuTrans/Classical-Modern

A parallel corpus of classical Chinese and modern Chinese texts for language processing and analysis.

+33

+2.4%

1.4K

total stars

Python

#647

databricks/LearningSparkV2

Code repository for the book Learning Spark, 2nd Edition, which teaches how to use Apache Spark for lightning-fast data analytics.

+33

+2.5%

1.4K

total stars

Scala

#648

PoloDB/PoloDB

PoloDB is an embedded document database written in Rust for building cross-platform, local-first applications.

+33

+2.9%

1.2K

total stars

Rust

#649

apache/incubator-xtable

Apache XTable is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

+33

+2.9%

1.2K

total stars

Java

#650

jblindsay/whitebox-tools

An advanced geospatial data analysis platform for tasks like geomorphology, hydrology, and remote sensing.

+33

+3.0%

1.1K

total stars

Rust
