Trending Projects

Discover the fastest growing open source projects

Showing 351-400 of 897 trending projects

#351
DataLinkDC/dinky

Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.

+248
+7.2%
3.7K
total stars
#352
antonycourtney/tad

A desktop application for viewing and analyzing tabular data, with support for CSV, Parquet, and DuckDB.

+247
+7.8%
3.4K
total stars
#353
datastacktv/data-engineer-roadmap

A roadmap for becoming a data engineer.

+242
+1.9%
12.7K
total stars
#354
scrollmapper/bible_databases

A collection of Bible versions and cross-reference databases.

+238
+18.8%
1.5K
total stars
#355
jvns/pandas-cookbook

Pandas Cookbook is a collection of recipes for using Python's powerful data analysis library, Pandas.

+237
+3.5%
7.0K
total stars
#356
ujjwalkarn/DataSciencePython

A collection of resources for common data analysis and machine learning tasks using Python.

+235
+4.3%
5.7K
total stars
#357
openmaptiles/openmaptiles

OpenMapTiles is an open-source vector tile schema implementation for creating custom map tiles.

+235
+8.4%
3.0K
total stars
#358
nullptrlabs/pgmodeler

An open-source data modeling tool designed for PostgreSQL, allowing developers to generate DDL commands visually.

+234
+7.1%
3.5K
total stars
#359
apache/parquet-format

Apache Parquet Format, a columnar data storage format used in the Apache Hadoop ecosystem.

+233
+11.5%
2.3K
total stars
#360
alibaba/druid

Druid is a high-performance database connection pool for Java applications, designed for monitoring and management.

+232
+0.8%
28.2K
total stars
#361
dgraph-io/badger

Fast, embeddable key-value database written in Go for building high-performance storage applications.

+231
+1.5%
15.5K
total stars
#362
h5py/h5py

A Python library for accessing the HDF5 binary data format, a popular format for scientific and numerical data.

+231
+11.7%
2.2K
total stars
#363
datawhalechina/competition-baseline

A collection of code examples and baselines for common data science and machine learning competitions.

+230
+5.1%
4.7K
total stars
#364
data-engineering-community/data-engineering-wiki

A community-driven wiki for learning data engineering, covering topics like data modeling, pipelines, and databases.

+229
+13.7%
1.9K
total stars
#365
google/cluster-data

A dataset of Borg cluster traces from Google, useful to researchers and developers working on distributed systems and cloud infrastructure.

+228
+28.0%
1.0K
total stars
#366
nalgeon/sqlean

The ultimate set of SQLite extensions for developers building applications with SQLite databases.

+227
+5.6%
4.3K
total stars
#367
JasonKessler/scattertext

A Python library for creating beautiful visualizations of language differences across document types.

+227
+10.8%
2.3K
total stars
#368
apache/hive

Apache Hive is data warehouse software built on top of Apache Hadoop for querying and managing large datasets.

+225
+3.9%
6.0K
total stars
#369
huandu/go-sqlbuilder

A flexible and powerful SQL string builder library plus a zero-config ORM for Go developers.

+225
+15.6%
1.7K
total stars
#370
galaxyproject/galaxy

An open-source, community-driven platform for data-intensive scientific analysis and visualization.

+222
+14.6%
1.7K
total stars
#371
crazyhottommy/RNA-seq-analysis

Notes and code for analyzing RNA-seq data using Python and Snakemake.

+222
+26.1%
1.1K
total stars
#372
pgvector/pgvector-python

A Python library that provides support for pgvector, the PostgreSQL extension for vector similarity search and storage.

+221
+18.0%
1.4K
total stars
#373
DotNetNext/SqlSugar

A powerful, multi-database ORM for .NET that supports a wide range of SQL databases and provides a seamless data access layer.

+218
+3.9%
5.8K
total stars
#374
seandavi/awesome-single-cell

A curated list of software packages and data resources for single-cell analysis, including RNA-seq and ATAC-seq.

+218
+6.3%
3.7K
total stars
#375
hosseinmoein/DataFrame

C++ DataFrame library for statistical, financial, and machine learning analysis.

+218
+8.1%
2.9K
total stars
#376
gee-community/geemap

A Python package for interactive geospatial analysis and visualization with Google Earth Engine.

+217
+5.9%
3.9K
total stars
#377
jupyter/docker-stacks

Docker images containing Jupyter applications for data science and machine learning workflows.

+216
+2.6%
8.4K
total stars
#378
VictoriaMetrics/fastcache

Fast in-memory cache library for Go with low GC overhead, optimized for a large number of entries.

+216
+10.2%
2.3K
total stars
#379
qinwf/awesome-R

A curated list of awesome R packages, frameworks and software for data analysis and data science.

+215
+3.5%
6.4K
total stars
#380
mdeff/fma

A dataset for music analysis and research, with support for deep learning and reproducible research.

+214
+9.1%
2.6K
total stars
#381
malloydata/malloy

Malloy is an open-source language for describing data relationships and transformations.

+214
+9.8%
2.4K
total stars
#382
mootdx/mootdx

A Python library for conveniently reading data from the Tongdaxin financial data platform.

+214
+18.3%
1.4K
total stars
#383
hazelcast/hazelcast

Hazelcast is a high-performance, distributed in-memory data platform for real-time insights and stream processing.

+213
+3.3%
6.6K
total stars
#384
sqlkata/querybuilder

SQL query builder for C# developers, supporting multiple databases and complex queries.

+213
+6.8%
3.3K
total stars
#385
JoshClose/CsvHelper

A C# library for reading and writing CSV files, with support for a wide range of CSV file formats.

+212
+4.2%
5.2K
total stars
#386
orbitdb/orbitdb

OrbitDB is a peer-to-peer database for the decentralized web, enabling developers to build offline-first, distributed applications.

+211
+2.5%
8.7K
total stars
#387
stephencelis/SQLite.swift

A type-safe, Swift-language layer over SQLite3 for building database-backed Swift applications.

+210
+2.1%
10.1K
total stars
#388
xerial/sqlite-jdbc

SQLite JDBC Driver - a Java library for accessing SQLite databases

+210
+7.0%
3.2K
total stars
#389
faroit/awesome-python-scientific-audio

Curated list of Python software and packages for scientific research in audio

+209
+14.2%
1.7K
total stars
#390
apache/auron

The Auron accelerator framework leverages vectorized execution to speed up distributed computing on big data platforms like Spark.

+208
+13.8%
1.7K
total stars
#391
ChawlaAvi/Daily-Dose-of-Data-Science

A collection of code snippets and tutorials for data science and data analysis in Python.

+208
+21.9%
1.2K
total stars
#392
apache/datafusion-ballista

Apache DataFusion Ballista is a distributed query engine for big data analysis, built with Rust and Arrow.

+207
+11.7%
2.0K
total stars
#393
felt/tippecanoe

Build vector tilesets from large collections of GeoJSON features.

+207
+16.7%
1.4K
total stars
#394
Azure/AzurePublicDataset

A repository of Microsoft Azure traces, accompanied by Jupyter notebooks.

+205
+23.3%
1.1K
total stars
#395
canonical/dqlite

An embeddable, replicated, and fault-tolerant SQL engine for building robust and scalable applications.

+204
+5.0%
4.3K
total stars
#396
ptyadana/SQL-Data-Analysis-and-Visualization-Projects

SQL data analysis and visualization projects using various tools and databases.

+203
+13.8%
1.7K
total stars
#397
Hiflylabs/awesome-dbt

A curated list of awesome resources for the data transformation tool dbt, focused on analytics engineering.

+202
+14.0%
1.6K
total stars
#398
apache/cloudberry

Open-source massively parallel processing (MPP) database, an alternative to Greenplum.

+202
+20.4%
1.2K
total stars
#399
isar/isar

Extremely fast, easy to use, and fully async NoSQL database for Flutter apps

+200
+5.3%
4.0K
total stars
#400
rogersce/cnpy

A C++ library for reading and writing .npy and .npz files, commonly used in scientific computing.

+200
+15.8%
1.5K
total stars