Trending Projects

Discover the fastest growing open source projects

Showing 601-650 of 897 trending projects

#601
apache/amoro

Apache Amoro is an open-source Lakehouse management system built on open data lake formats such as Iceberg and Hudi, for use with engines like Flink.

+42
+3.9%
1.1K
total stars
#602
apache/phoenix

Apache Phoenix is a scalable, distributed SQL engine that connects to HBase for low-latency queries.

+42
+4.2%
1.1K
total stars
#603
dotnetcore/FreeSql

An ORM (Object-Relational Mapping) library for .NET that supports a wide range of database providers, including SQL Server, MySQL, PostgreSQL, and more.

+41
+0.9%
4.4K
total stars
#604
PyWavelets/pywt

PyWavelets is a Python library for wavelet transform algorithms and techniques, useful for image and signal processing.

+41
+1.8%
2.3K
total stars
#605
TuGraph-family/tugraph-db

TuGraph-DB is a high-performance graph database built for fast and efficient graph data processing.

+41
+2.5%
1.7K
total stars
#606
reata/sqllineage

A SQL lineage analysis tool written in Python, useful for data discovery and governance.

+41
+2.6%
1.6K
total stars
#607
percona/percona-xtrabackup

Open source hot backup tool for InnoDB and XtraDB databases

+41
+2.8%
1.5K
total stars
#608
cloudera/hue

Open source SQL query assistant service for databases and data warehouses

+41
+2.9%
1.4K
total stars
#609
datavane/tis

A Java-based framework for building agile DataOps pipelines using tools like Flink, DataX, and Chunjun with a web UI.

+41
+3.3%
1.3K
total stars
#610
YuLab-SMU/clusterProfiler

A comprehensive enrichment analysis tool for interpreting omics data, with support for GO, KEGG, and more.

+41
+3.6%
1.2K
total stars
#611
zinggAI/zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

+41
+3.7%
1.2K
total stars
#612
PDAL/PDAL

PDAL is a C++ library for processing point cloud data, similar to GDAL for raster data.

+40
+3.1%
1.3K
total stars
#613
wesm/msgvault

Archive, search, and analyze your entire email/chat history offline with DuckDB-powered analytics and AI queries.

+40
+3.2%
1.3K
total stars
#614
cvxgrp/cvxportfolio

A Python library for portfolio optimization and back-testing in finance.

+40
+3.5%
1.2K
total stars
#615
fluentmigrator/fluentmigrator

Fluent Migrator is a .NET migration framework for managing database schema changes across multiple database providers.

+39
+1.1%
3.5K
total stars
#616
tensorchord/pgvecto.rs

Scalable, low-latency vector search in Postgres.

+39
+1.8%
2.2K
total stars
#617
Werneror/Poetry

A dataset of over 850,000 Chinese poems from ancient to modern times.

+39
+2.3%
1.7K
total stars
#618
Awesome-Image-Registration-Organization/awesome-image-registration

A curated collection of resources related to image registration, including books, papers, videos, and toolboxes.

+39
+2.7%
1.5K
total stars
#619
patx/pickledb

An in-memory key-value store for Python that uses the orjson library for persistence, with SQLite support.

+39
+3.8%
1.1K
total stars
#620
alibaba/druid

Druid is a high-performance database connection pool for Java applications, designed for monitoring and management.

+38
+0.1%
28.2K
total stars
#621
cayleygraph/cayley

An open-source graph database written in Go, useful for building applications that require linked data and graph-based queries.

+38
+0.3%
15.0K
total stars
#622
databricks/koalas

Koalas is a pandas-like API for Apache Spark, enabling data scientists to work with big data using familiar pandas syntax.

+38
+1.1%
3.4K
total stars
#623
tirthajyoti/Data-science-best-resources

A curated collection of resources for data science and machine learning enthusiasts.

+38
+1.2%
3.2K
total stars
#624
mourner/rbush

RBush is a high-performance JavaScript R-tree-based 2D spatial index for points and rectangles.

+38
+1.4%
2.7K
total stars
#625
broadinstitute/gatk

Official code repository for the Genome Analysis Toolkit (GATK), a bioinformatics library for working with next-generation DNA sequencing data.

+38
+2.0%
1.9K
total stars
#626
fonnesbeck/statistical-analysis-python-tutorial

A tutorial for performing statistical data analysis using Python, covering topics like regression, hypothesis testing, and more.

+38
+2.3%
1.7K
total stars
#627
pysal/pysal

PySAL is a Python Spatial Analysis Library meta-package for geographical data analysis and modeling.

+38
+2.6%
1.5K
total stars
#628
PyO3/rust-numpy

Rust-based bindings for the NumPy C-API, enabling developers to leverage Rust for numerical computing.

+38
+2.9%
1.3K
total stars
#629
x2bool/xlite

A Rust library that enables querying Excel spreadsheets using SQLite, making data extraction and analysis more efficient.

+38
+3.0%
1.3K
total stars
#630
LongOnly/Quantitative-Notebooks

Educational notebooks on quantitative finance, algorithmic trading, financial modeling, and investment strategy.

+38
+3.0%
1.3K
total stars
#631
paulvangentcom/heartrate_analysis_python

A Python package for analyzing heart rate data from PPG and ECG signals.

+38
+3.6%
1.1K
total stars
#632
konradhalas/dacite

A simple Python library for creating dataclasses from dictionaries.

+37
+1.9%
2.0K
total stars
#633
DaveSkender/Stock.Indicators

A C# NuGet package that provides technical indicators and trading insights for financial market data analysis.

+37
+3.2%
1.2K
total stars
#634
realm/realm-core

Core database component for the Realm Mobile Database SDKs, a popular NoSQL database for mobile apps.

+37
+3.7%
1.0K
total stars
#635
apache/zeppelin

Zeppelin is a web-based notebook that enables data-driven, interactive data analytics and collaborative documents.

+36
+0.6%
6.6K
total stars
#636
psycopg/psycopg2

A Python database adapter for PostgreSQL, allowing developers to interact with their databases.

+36
+1.0%
3.6K
total stars
#637
crate/crate

CrateDB is a distributed, scalable SQL database for storing and analyzing massive amounts of data in near real-time.

+35
+0.8%
4.4K
total stars
#638
risinglightdb/risinglight

An educational OLAP database system built in Rust for learning and experimentation.

+35
+2.0%
1.8K
total stars
#639
chaisql/chai

A modern SQL database written in Go for embedded and mobile applications.

+35
+2.1%
1.7K
total stars
#640
meta-pytorch/data

A PyTorch library for data loading and utility functions shared across PyTorch domain libraries.

+35
+2.9%
1.2K
total stars
#641
XD-DENG/SQL-exercise

A collection of SQL practice problems for developers to improve their SQL skills.

+34
+2.4%
1.5K
total stars
#642
toluaina/pgsync

A Python library that syncs data from Postgres to Elasticsearch/OpenSearch, enabling real-time data pipelines.

+34
+2.5%
1.4K
total stars
#643
WeBankFinTech/DataSphereStudio

DataSphereStudio is a one-stop data application development and management portal covering data exchange, analysis, and visualization.

+33
+1.0%
3.3K
total stars
#644
geekyouth/SZT-bigdata

This is a big data analysis system for the Shenzhen metro with support for various data processing tools.

+33
+1.4%
2.4K
total stars
#645
uwdata/arquero

A JavaScript library for efficient querying and transformation of array-backed data tables.

+33
+2.3%
1.5K
total stars
#646
NiuTrans/Classical-Modern

A parallel corpus of classical Chinese and modern Chinese texts for language processing and analysis.

+33
+2.4%
1.4K
total stars
#647
databricks/LearningSparkV2

Companion repository for the book Learning Spark, 2nd Edition, which teaches how to use Apache Spark for lightning-fast data analytics.

+33
+2.5%
1.4K
total stars
#648
PoloDB/PoloDB

PoloDB is an embedded document database written in Rust for building cross-platform, local-first applications.

+33
+2.9%
1.2K
total stars
#649
apache/incubator-xtable

Apache XTable is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

+33
+2.9%
1.2K
total stars
#650
jblindsay/whitebox-tools

An advanced geospatial data analysis platform for tasks like geomorphology, hydrology, and remote sensing.

+33
+3.0%
1.1K
total stars