Trending Projects

Discover the fastest growing open source projects

Showing 101-150 of 897 trending projects

#101
dbt-labs/dbt-core

dbt enables data analysts and engineers to transform data using software engineering practices.

0
0.0%
12.3K
total stars
#102
mysql/mysql-server

Open-source relational database engine powering web apps, APIs, and data-driven backends worldwide.

0
0.0%
12.2K
total stars
#103
vesoft-inc/nebula

Nebula is a fast, open-source, distributed graph database with horizontal scalability and high availability.

0
0.0%
12.1K
total stars
#104
OpenRefine/OpenRefine

OpenRefine is a powerful data cleaning and transformation tool that helps developers work with messy data.

0
0.0%
11.8K
total stars
#105
TA-Lib/ta-lib-python

Python wrapper for the TA-Lib technical analysis library, useful for financial pattern recognition.

0
0.0%
11.8K
total stars
#106
dhamaniasad/awesome-postgres

A curated list of awesome PostgreSQL software, libraries, tools and resources.

0
0.0%
11.7K
total stars
#107
Tencent/wcdb

WCDB is a cross-platform database framework developed by WeChat for Android, iOS, Linux, macOS, and Windows.

0
0.0%
11.7K
total stars
#108
datahub-project/datahub

An open-source metadata platform for managing your data and AI stack across the enterprise.

0
0.0%
11.6K
total stars
#109
realm/realm-java

Realm is a mobile database that serves as a replacement for SQLite and ORMs.

0
0.0%
11.5K
total stars
#110
StarRocks/starrocks

A high-performance open source query engine for sub-second analytics on the data lakehouse.

0
0.0%
11.4K
total stars
#111
statsmodels/statsmodels

Statsmodels is a Python library for statistical modeling and econometrics, providing tools for data analysis and prediction.

0
0.0%
11.3K
total stars
#112
great-expectations/great_expectations

A Python library that helps ensure data quality and reliability through data profiling and testing.

0
0.0%
11.2K
total stars
#113
rougier/scientific-visualization-book

An open-access book on scientific visualization using Python and Matplotlib for data-driven developers.

0
0.0%
11.2K
total stars
#114
microsoft/sql-server-samples

This repository contains code samples for SQL Server, Azure SQL, and related data services from Microsoft.

0
0.0%
10.9K
total stars
#115
simonw/datasette

An open-source multi-tool for exploring and publishing data, focused on simplifying data analysis and sharing.

0
0.0%
10.8K
total stars
#116
kedro-org/kedro

Kedro is a Python toolkit for building production-ready data science and machine learning pipelines.

0
0.0%
10.8K
total stars
#117
PRQL/prql

PRQL is a modern, powerful, and pipelined SQL replacement for transforming data.

0
0.0%
10.7K
total stars
#118
pingcap/awesome-database-learning

A comprehensive list of learning materials to help developers understand database internals.

0
0.0%
10.7K
total stars
#119
dicedb/dicedb

DiceDB is an open-source, fast, reactive, in-memory database optimized for modern hardware.

0
0.0%
10.7K
total stars
#120
wangzhiwubigdata/God-Of-BigData

A comprehensive collection of resources and learning materials for big data technologies like Flink, Spark, Hadoop, and Hive.

0
0.0%
10.4K
total stars
#121
modin-project/modin

Modin: Scalable Pandas workflows with a single line of code change, enabling distributed data processing.

0
0.0%
10.4K
total stars
#122
cstack/db_tutorial

A tutorial for writing a SQLite clone from scratch in C, a useful resource for developers building database-backed applications.

0
0.0%
10.3K
total stars
#123
stephencelis/SQLite.swift

A type-safe, Swift-language layer over SQLite3 for building database-backed Swift applications.

0
0.0%
10.1K
total stars
#124
oceanbase/oceanbase

A fast, scalable, and distributed database for transactional, analytical, and AI workloads.

0
0.0%
10.0K
total stars
#125
alexeygrigorev/data-science-interviews

A repository of data science interview questions and answers for developers.

0
0.0%
9.8K
total stars
#126
drivendataorg/cookiecutter-data-science

A flexible and standardized cookiecutter template for doing and sharing data science work in Python.

0
0.0%
9.7K
total stars
#127
doctrine/dbal

A PHP database abstraction layer that provides a simple, consistent API for interacting with different database systems.

0
0.0%
9.7K
total stars
#128
apache/cassandra

Apache Cassandra is a distributed, wide-column store database system designed for high availability, scalability, and performance.

0
0.0%
9.6K
total stars
#129
rapidsai/cudf

A high-performance GPU DataFrame library for data analysis and machine learning workloads.

0
0.0%
9.5K
total stars
#130
litedb-org/LiteDB

LiteDB is a lightweight, embedded NoSQL document database for .NET applications that can be used in a single data file.

0
0.0%
9.4K
total stars
#131
dr5hn/countries-states-cities-database

A comprehensive database of countries, states, and cities with data in multiple formats.

0
0.0%
9.3K
total stars
#132
databendlabs/databend

Unified cloud-native data warehouse platform for analytics, search and AI, built on top of S3 storage.

0
0.0%
9.2K
total stars
#133
pymupdf/PyMuPDF

A high-performance Python library for data extraction, analysis, conversion and manipulation of PDF and other documents.

0
0.0%
9.2K
total stars
#134
apache/seatunnel

A high-performance, distributed data integration tool for batch, streaming, and CDC use cases.

0
0.0%
9.1K
total stars
#135
sqlite/sqlite

Official Git mirror of the SQLite source tree, a popular and widely-used embedded database engine.

0
0.0%
9.1K
total stars
#136
mattn/go-sqlite3

A lightweight SQLite3 driver for Go that implements the database/sql interface.

0
0.0%
9.0K
total stars
#137
spacejam/sled

A high-performance, concurrent, embedded key-value database written in Rust for vibe coders.

0
0.0%
8.9K
total stars
#138
ricklamers/gridstudio

Grid Studio is a web-based application for data science with full integration of open source data science frameworks and languages.

0
0.0%
8.9K
total stars
#139
open-metadata/OpenMetadata

A unified metadata platform for data discovery, data observability, and data governance.

0
0.0%
8.8K
total stars
#140
iamseancheney/python_for_data_analysis_2nd_chinese_version

A Chinese translation of a popular book on using Python for data analysis with libraries like pandas and numpy.

0
0.0%
8.8K
total stars
#141
orbitdb/orbitdb

OrbitDB is a peer-to-peer database for the decentralized web, enabling developers to build offline-first, distributed applications.

0
0.0%
8.7K
total stars
#142
alibaba/zvec

Lightning-fast in-process vector DB for RAG & semantic search in C++.

0
0.0%
8.7K
total stars
#143
mage-ai/mage-ai

mage-ai is a Python-based platform for building, running, and managing data pipelines and integrating/transforming data.

0
0.0%
8.7K
total stars
#144
delta-io/delta

An open-source data lakehouse framework that enables building data pipelines with leading big data compute engines.

0
0.0%
8.6K
total stars
#145
apache/iceberg

Apache Iceberg is an open-source table format for large analytic datasets, providing a versioned and scalable data lake architecture.

0
0.0%
8.6K
total stars
#146
ideawu/ssdb

SSDB is a fast NoSQL database, an alternative to Redis, with support for leveldb and rocksdb backends.

0
0.0%
8.5K
total stars
#147
apache/beam

Apache Beam is a unified programming model for batch and streaming data processing.

0
0.0%
8.5K
total stars
#148
vaexio/vaex

A high-performance Python library for working with large tabular datasets, offering efficient data manipulation and visualization.

0
0.0%
8.5K
total stars
#149
apache/datafusion

Apache DataFusion is a powerful SQL query engine written in Rust, designed for big data processing and analysis.

0
0.0%
8.5K
total stars
#150
paradedb/paradedb

A Rust-based, Elasticsearch-quality search engine for PostgreSQL, enabling fast, real-time analytics and HTAP use cases.

0
0.0%
8.5K
total stars