Trending Projects

Discover the fastest growing open source projects

Showing 301-350 of 897 trending projects

#301
sql-js/sql.js

A JavaScript library that allows you to run SQLite on the web, enabling local database functionality for web apps.

+141
+1.1%
13.6K
total stars
#302
cstack/db_tutorial

A tutorial for writing a SQLite clone from scratch in C, a useful resource for developers building database-backed applications.

+141
+1.4%
10.3K
total stars
#303
skyzh/mini-lsm

A Rust-based implementation of an LSM-Tree storage engine (database) for developers to build and learn from.

+141
+3.7%
3.9K
total stars
#304
skfolio/skfolio

A Python library for portfolio optimization using scikit-learn and convex optimization techniques.

+141
+8.1%
1.9K
total stars
#305
dask/dask-tutorial

An interactive tutorial for the Dask distributed computing library, focused on data analysis and manipulation.

+141
+8.2%
1.9K
total stars
#306
sequelize/sequelize

An ORM for Node.js/TypeScript with multiple database support.

+140
+0.5%
30.3K
total stars
#307
plotters-rs/plotters

A high-quality, cross-platform data plotting library for Rust developers, including WebAssembly support.

+140
+3.2%
4.5K
total stars
#308
narwhals-dev/narwhals

Lightweight and extensible compatibility layer between popular dataframe libraries like Pandas, Dask, and PySpark.

+140
+9.9%
1.5K
total stars
#309
QueryKit/QueryKit

QueryKit is a simple Core Data query language for Swift and Objective-C developers.

+140
+10.6%
1.5K
total stars
#310
compose/transporter

Transporter is a powerful ETL tool that allows developers to sync data between various persistence engines.

+139
+10.6%
1.4K
total stars
#311
huangzhibiao/BGFMDB

A simple Objective-C library that provides a one-line CRUD interface for SQLite databases on iOS.

+139
+10.6%
1.4K
total stars
#312
attaswift/BTree

A fast, in-memory B-tree implementation for sorted collections in Swift.

+138
+11.6%
1.3K
total stars
#313
gtoonstra/etl-with-airflow

This repository provides best practices and examples for building ETL (Extract, Transform, Load) pipelines using Apache Airflow.

+137
+11.3%
1.4K
total stars
#314
JasonKessler/scattertext

A Python library for creating beautiful visualizations of language differences across document types.

+136
+6.2%
2.3K
total stars
#315
PyTables/PyTables

A powerful Python package to manage and work with extremely large amounts of data.

+135
+11.0%
1.4K
total stars
#316
ClickHouse/clickhouse-go

A Go driver for the ClickHouse analytics database, enabling fast and efficient data processing.

+133
+4.3%
3.3K
total stars
#317
timeplus-io/proton

Fast, single-binary C++ SQL ETL pipeline for stream processing, observability, analytics, and AI/ML.

+133
+6.6%
2.2K
total stars
#318
apache/parquet-format

Apache Parquet Format, a columnar data storage format used in the Apache Hadoop ecosystem.

+132
+6.2%
2.3K
total stars
#319
cozodb/cozo

A transactional, relational-graph-vector database that uses Datalog for query, designed for AI and ML use cases.

+131
+3.5%
3.9K
total stars
#320
dbt-labs/metricflow

MetricFlow allows developers to define, build, and maintain metrics in code for business intelligence and analytics.

+131
+9.5%
1.5K
total stars
#321
heibaiying/BigData-Notes

A comprehensive guide to big data technologies like Hadoop, Spark, Kafka, and more for developers.

+130
+0.8%
16.9K
total stars
#322
rogersce/cnpy

A C++ library for reading and writing .npy and .npz files, commonly used in scientific computing.

+130
+9.7%
1.5K
total stars
#323
msiemens/tinydb

A lightweight, document-oriented database optimized for happiness, used as a Python library or CLI.

+129
+1.8%
7.5K
total stars
#324
pyvista/pyvista

A Python library for 3D plotting and mesh analysis using the Visualization Toolkit (VTK).

+129
+3.8%
3.6K
total stars
#325
kblin/ncbi-genome-download

Scripts to download genomes from the NCBI FTP servers for bioinformatics and genomics research.

+128
+13.7%
1.1K
total stars
#326
collabH/bigdata-growth

A comprehensive repository covering big data knowledge, including data warehouse modeling, real-time computing, Hadoop, Spark, and more.

+125
+7.8%
1.7K
total stars
#327
slashbase/slashbaseide

Modern database IDE for dev & data workflows, supporting MySQL, PostgreSQL & MongoDB.

+125
+10.5%
1.3K
total stars
#328
uber-archive/AthenaX

A scalable, SQL-based streaming analytics platform from Uber, built on top of Apache Flink.

+124
+11.3%
1.2K
total stars
#329
datacrypt-project/hitchhiker-tree

A high-performance, persistent, off-heap data structure written in Clojure for data-intensive applications.

+124
+11.4%
1.2K
total stars
#330
mybatis/mybatis-3

MyBatis SQL Mapper for Java simplifies database interactions with object mapping.

+123
+0.6%
20.4K
total stars
#331
OpenRefine/OpenRefine

OpenRefine is a powerful data cleaning and transformation tool that helps developers work with messy data.

+123
+1.1%
11.8K
total stars
#332
mage-ai/mage-ai

mage-ai is a Python-based platform for building, running, and managing data pipelines and integrating/transforming data.

+123
+1.4%
8.7K
total stars
#333
CamDavidsonPilon/lifelines

A Python library for survival analysis, useful for developers working with time-to-event data.

+123
+5.0%
2.6K
total stars
#334
apache/beam

Apache Beam is a unified programming model for batch and streaming data processing.

+122
+1.5%
8.5K
total stars
#335
JetBrains/xodus

Xodus is a transactional, schema-less embedded database used by JetBrains products like YouTrack and Hub.

+121
+10.7%
1.3K
total stars
#336
pomber/covid19

A public dataset of daily COVID-19 cases and deaths per country, useful for data analysis and visualization.

+121
+11.0%
1.2K
total stars
#337
bububa/MongoHub-Mac

MongoHub is a native macOS MongoDB client that provides a GUI for managing and interacting with MongoDB databases.

+121
+11.4%
1.2K
total stars
#338
pentaho/pentaho-kettle

Pentaho Data Integration (ETL) is a Java-based tool for building data integration and ETL pipelines.

+120
+1.5%
8.3K
total stars
#339
vlcn-io/cr-sqlite

A Rust library that provides multi-writer and CRDT support for SQLite databases.

+120
+3.4%
3.6K
total stars
#340
alexkay/spek

An acoustic spectrum analyzer written in C++ for audio analysis and visualization.

+120
+3.9%
3.2K
total stars
#341
sfirke/janitor

A collection of simple tools for data cleaning and wrangling in R for data science tasks.

+120
+9.1%
1.4K
total stars
#342
marsupialtail/quokka

A scalable, distributed ETL framework for building data lake analytics pipelines.

+120
+11.2%
1.2K
total stars
#343
typicode/lowdb

A lightweight local JSON database for JavaScript/TypeScript apps.

+119
+0.5%
22.5K
total stars
#344
sacridini/Awesome-Geospatial

A comprehensive collection of geospatial tools and resources for data analysis, machine learning, and spatial applications.

+119
+2.5%
4.8K
total stars
#345
frectonz/sql-studio

A SQL database explorer supporting multiple database engines like SQLite, PostgreSQL, and MySQL.

+116
+3.4%
3.5K
total stars
#346
amphi-ai/amphi-etl

A visual data preparation tool powered by Python, designed for data analysis and ETL tasks.

+116
+9.4%
1.4K
total stars
#347
apachecn/spark-doc-zh

This repository provides a Chinese translation of the official documentation for Apache Spark, a popular big data processing framework.

+115
+10.8%
1.2K
total stars
#348
golang/leveldb

The LevelDB key-value database in the Go programming language.

+115
+11.1%
1.2K
total stars
#349
delta-io/delta-rs

A Rust library for interacting with Delta Lake, a data lake storage format, with Python bindings.

+114
+3.7%
3.2K
total stars
#350
youngwookim/awesome-hadoop

A curated list of resources for the Hadoop ecosystem.

+113
+11.3%
1.1K
total stars