Trending Projects

Discover the fastest-growing open source projects

Showing 551-600 of 897 trending projects

#551
dremio/dremio-oss

Dremio is an open-source data analytics platform that simplifies and accelerates big data analysis.

+49
+3.4%
1.5K
total stars
#552
TeoMeWhy/teomerefs

A comprehensive guide to technical references for data careers, including Python, machine learning, and data science.

+49
+4.0%
1.3K
total stars
#553
dicedb/dicedb

DiceDB is an open-source, fast, reactive, in-memory database optimized for modern hardware.

+48
+0.5%
10.7K
total stars
#554
sqlkata/querybuilder

SQL query builder for C# developers, supporting multiple databases and complex queries.

+48
+1.5%
3.3K
total stars
#555
linq2db/linq2db

LINQ to database provider for .NET, supporting various database engines.

+48
+1.5%
3.2K
total stars
#556
alibaba/clusterdata

A dataset of cluster data collected from Alibaba's production clusters for cluster management research.

+48
+2.5%
2.0K
total stars
#557
tidyverse/tidyr

tidyr is an R package that provides a set of functions to tidy messy data into a format suitable for analysis.

+48
+3.5%
1.4K
total stars
#558
manami-project/anime-offline-database

This repository provides a comprehensive JSON dataset containing metadata on anime series, movies, and cross-references to various anime sites.

+48
+4.0%
1.2K
total stars
#559
mwaskom/seaborn-data

This is a data repository for the Seaborn data visualization library in Python.

+47
+2.6%
1.8K
total stars
#560
google/tensorstore

A C++ library for reading and writing large multi-dimensional arrays, useful for scientific and data-intensive applications.

+47
+3.2%
1.5K
total stars
#561
JoinQuant/jqdatasdk

A Python package for easy access to financial market data in China for quantitative finance and FinTech applications.

+47
+4.0%
1.2K
total stars
#562
Tencent/wcdb

WCDB is a cross-platform database framework developed by WeChat for Android, iOS, Linux, macOS, and Windows.

+46
+0.4%
11.7K
total stars
#563
xiangyuecn/AreaCity-JsSpider-StatsGov

Comprehensive collection of city and administrative region data for China, with features like CSV export, JS code generation, and web scraping.

+46
+0.7%
6.4K
total stars
#564
chdb-io/chdb

An in-process OLAP SQL Engine powered by ClickHouse, enabling fast and efficient data analysis.

+46
+1.8%
2.6K
total stars
#565
LibRaw/LibRaw

LibRaw is a C++ library for reading RAW image files from digital cameras.

+46
+3.3%
1.4K
total stars
#566
apache/ozone

Scalable, reliable, distributed storage system optimized for data analytics and object store workloads.

+46
+4.1%
1.2K
total stars
#567
ddotta/awesome-polars

A curated list of resources for Polars, an open-source, high-performance data manipulation library for Python and Rust.

+46
+4.5%
1.1K
total stars
#568
dbeaver/cloudbeaver

Cloud-based database manager UI for querying, managing, and visualizing databases across multiple platforms.

+45
+1.0%
4.7K
total stars
#569
avhz/RustQuant

A Rust library for quantitative finance, including tools for machine learning, option pricing, and trading.

+45
+2.8%
1.7K
total stars
#570
pgvector/pgvector-python

A Python library that provides support for pgvector, the vector similarity search extension for Postgres, enabling efficient vector search and storage.

+45
+3.2%
1.4K
total stars
#571
avehtari/BDA_py_demos

Provides Bayesian data analysis demos in Python for developers interested in probabilistic modeling.

+45
+4.5%
1.0K
total stars
#572
mysql/mysql-server

Open-source relational database engine powering web apps, APIs, and data-driven backends worldwide.

+44
+0.4%
12.2K
total stars
#573
stephencelis/SQLite.swift

A type-safe, Swift-language layer over SQLite3 for building database-backed Swift applications.

+44
+0.4%
10.1K
total stars
#574
liam-hq/liam

Automatically generates beautiful and easy-to-read ER diagrams from your database.

+44
+0.9%
4.7K
total stars
#575
apache/datafusion-ballista

Apache DataFusion Ballista is a distributed query engine for big data analysis, built with Rust and Arrow.

+44
+2.3%
2.0K
total stars
#576
igraph/igraph

A powerful C library for analyzing complex networks and graph-based data structures.

+44
+2.3%
1.9K
total stars
#577
collabH/bigdata-growth

A comprehensive repository covering big data knowledge, including data warehouse modeling, real-time computing, Hadoop, Spark, and more.

+44
+2.6%
1.7K
total stars
#578
fonnesbeck/statistical-analysis-python-tutorial

A tutorial for performing statistical data analysis using Python, covering topics like regression, hypothesis testing, and more.

+44
+2.6%
1.7K
total stars
#579
objectbox/objectbox-go

Embedded Go Database, a fast open-source NoSQL database solution for Go projects.

+44
+3.6%
1.3K
total stars
#580
alecthw/mmdb_china_ip_list

A library for generating MaxMind GeoIP2 databases for China IP addresses.

+44
+4.1%
1.1K
total stars
#581
rhiever/datacleaner

A Python tool that automatically cleans and preprocesses data for analysis and machine learning.

+44
+4.3%
1.1K
total stars
#582
Netflix/maestro

Maestro is Netflix's workflow orchestrator for building data pipelines and batch processing workflows.

+43
+1.2%
3.7K
total stars
#583
ClickHouse/clickhouse-go

A Go driver for the ClickHouse analytics database, enabling fast and efficient data processing.

+43
+1.3%
3.3K
total stars
#584
timeplus-io/proton

Fast, single-binary C++ SQL ETL pipeline for stream processing, observability, analytics, and AI/ML.

+43
+2.0%
2.2K
total stars
#585
litedb-org/LiteDB

LiteDB is a lightweight, embedded NoSQL document database for .NET applications that stores its data in a single file.

+42
+0.5%
9.4K
total stars
#586
hazelcast/hazelcast

Hazelcast is a high-performance, distributed in-memory data platform for real-time insights and stream processing.

+42
+0.6%
6.6K
total stars
#587
ujjwalkarn/DataSciencePython

A Python library for common data analysis and machine learning tasks.

+42
+0.7%
5.7K
total stars
#588
MakieOrg/Makie.jl

A powerful data visualization and plotting library for the Julia programming language.

+42
+1.6%
2.7K
total stars
#589
galaxyproject/galaxy

An open-source, community-driven platform for data-intensive scientific analysis and visualization.

+42
+2.5%
1.7K
total stars
#590
amphi-ai/amphi-etl

A visual data preparation tool powered by Python, designed for data analysis and ETL tasks.

+42
+3.2%
1.4K
total stars
#591
paulvangentcom/heartrate_analysis_python

A Python package for analyzing heart rate data from PPG and ECG signals.

+42
+4.0%
1.1K
total stars
#592
gee-community/geemap

A Python package for interactive geospatial analysis and visualization with Google Earth Engine.

+41
+1.1%
3.9K
total stars
#593
hosseinmoein/DataFrame

C++ DataFrame library for statistical, financial, and machine learning analysis.

+41
+1.4%
2.9K
total stars
#594
ptyadana/SQL-Data-Analysis-and-Visualization-Projects

This GitHub repository contains SQL data analysis and visualization projects using various tools and databases.

+41
+2.5%
1.7K
total stars
#595
Yimeng-Zhang/feature-engineering-and-feature-selection

A comprehensive guide to feature engineering and feature selection techniques in Python, with examples.

+41
+2.6%
1.6K
total stars
#596
mono/taglib-sharp

A C# library for reading and writing metadata in media files, useful for audio and video processing applications.

+41
+3.0%
1.4K
total stars
#597
Hiflylabs/awesome-dbt

A curated list of awesome resources for the data transformation tool dbt, focused on analytics engineering.

+40
+2.5%
1.6K
total stars
#598
wesm/msgvault

Archive, search, and analyze your entire email/chat history offline with DuckDB-powered analytics and AI queries.

+40
+3.2%
1.3K
total stars
#599
jbmusso/awesome-graph

A curated list of resources for graph databases and graph computing tools, useful for developers working with graph-based data.

+40
+3.3%
1.2K
total stars
#600
samapriya/awesome-gee-community-datasets

A community-driven catalog of geospatial datasets for use with Google Earth Engine.

+40
+3.8%
1.1K
total stars
