Trending Projects

Discover the fastest-growing open source projects

Showing 201-250 of 897 trending projects

#201
ujjwalkarn/DataSciencePython

A collection of Python tutorials and resources for common data analysis and machine learning tasks.

0
0.0%
5.7K
total stars
#202
owid/covid-19-data

COVID-19 data repository for developers, providing daily updated case, death, and testing information.

0
0.0%
5.7K
total stars
#203
apache/hbase

Apache HBase is a distributed, scalable, fault-tolerant database for large datasets built on top of HDFS.

0
0.0%
5.6K
total stars
#204
airbnb/knowledge-repo

A next-generation curated knowledge sharing platform for data scientists and other technical professionals.

0
0.0%
5.5K
total stars
#205
youssefHosni/Data-Science-Interview-Questions-Answers

A curated list of data science interview questions and answers for developers.

0
0.0%
5.5K
total stars
#206
PyPortfolio/PyPortfolioOpt

A Python library for financial portfolio optimization, including classical efficient frontier and advanced techniques.

0
0.0%
5.5K
total stars
#207
lux-org/lux

Automatically visualize your pandas dataframes with a single print command, enabling quick EDA.

0
0.0%
5.4K
total stars
#208
kakuilan/china_area_mysql

A MySQL database of China's five-level administrative regions.

0
0.0%
5.3K
total stars
#209
dunwu/db-tutorial

An in-depth tutorial covering mainstream database knowledge for backend developers.

0
0.0%
5.3K
total stars
#210
cube2222/octosql

OctoSQL is a powerful SQL query tool that allows you to join, analyze, and transform data from multiple databases and file formats.

0
0.0%
5.2K
total stars
#211
thinkaurelius/titan

Titan is a distributed graph database that can be used for building large-scale data-intensive applications.

0
0.0%
5.2K
total stars
#212
JoshClose/CsvHelper

A C# library for reading and writing CSV files, with support for a wide range of CSV file formats.

0
0.0%
5.2K
total stars
#213
sripathikrishnan/redis-rdb-tools

A Python tool to parse Redis dump.rdb files, analyze memory usage, and export data to JSON.

0
0.0%
5.2K
total stars
#214
treeverse/lakeFS

lakeFS is a Git-like version control system for data lakes, enabling data engineers to manage data versioning and data quality.

0
0.0%
5.2K
total stars
#215
fluvio-community/fluvio

Fluvio is an event stream processing engine for developers to build responsive data-intensive apps.

0
0.0%
5.2K
total stars
#216
jeremyevans/sequel

Sequel is a Ruby library that provides a powerful and flexible object-relational mapping (ORM) for databases.

0
0.0%
5.1K
total stars
#217
TurboWay/bigdata_analyse

This is a Python project for big data analysis, focusing on HQL, SQL, and data processing.

0
0.0%
5.0K
total stars
#218
mgramin/awesome-db-tools

A curated list of awesome database tools and resources to make working with databases easier.

0
0.0%
5.0K
total stars
#219
tidyverse/dplyr

dplyr is an R package that provides a grammar of data manipulation.

0
0.0%
5.0K
total stars
#220
dlt-hub/dlt

An open-source Python library that simplifies the process of loading data into data lakes and warehouses.

0
0.0%
5.0K
total stars
#221
orientechnologies/orientdb

OrientDB is a versatile, multi-model DBMS that supports Graph, Document, Reactive, Full-Text, and Geospatial models.

0
0.0%
4.9K
total stars
#222
deepseek-ai/smallpond

A lightweight data processing framework built on DuckDB and 3FS.

0
0.0%
4.9K
total stars
#223
biopython/biopython

Biopython is a set of Python modules that provide a wide range of functionality for bioinformatics, including DNA/RNA/protein sequence analysis, phylogenetics, and more.

0
0.0%
4.9K
total stars
#224
bukosabino/ta

Technical Analysis Library using Pandas and Numpy for financial data analysis and trading strategies.

0
0.0%
4.9K
total stars
#225
cmu-db/bustub

An educational relational database management system (RDBMS) implementation in C++.

0
0.0%
4.9K
total stars
#226
rosedblabs/rosedb

Lightweight, fast, and reliable key-value database engine in Go for high-throughput applications.

0
0.0%
4.9K
total stars
#227
mathesar-foundation/mathesar

An open-source, self-hosted database management tool with a spreadsheet-like interface for Postgres.

0
0.0%
4.9K
total stars
#228
pudo/dataset

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.

0
0.0%
4.9K
total stars
#229
tidwall/buntdb

BuntDB is an embeddable, in-memory key/value database for Go with custom indexing and geospatial support.

0
0.0%
4.8K
total stars
#230
alandefreitas/matplotplusplus

Matplot++: A C++ graphics library for creating high-quality data visualizations and scientific plots.

0
0.0%
4.8K
total stars
#231
GoogleTrends/data

An open-source index of Google Trends data, useful for developers building data-driven applications.

0
0.0%
4.8K
total stars
#232
lk-geimfari/mimesis

Mimesis is a fast Python library for generating fake data in multiple languages for testing and development purposes.

0
0.0%
4.8K
total stars
#233
sacridini/Awesome-Geospatial

A comprehensive collection of geospatial tools and resources for data analysis, machine learning, and spatial applications.

0
0.0%
4.8K
total stars
#234
amundsen-io/amundsen

Amundsen is an open-source data discovery platform for improving productivity of data analysts and engineers.

0
0.0%
4.7K
total stars
#235
datawhalechina/competition-baseline

A collection of code examples and baselines for common data science and machine learning competitions.

0
0.0%
4.7K
total stars
#236
liam-hq/liam

Automatically generates beautiful and easy-to-read ER diagrams from your database.

0
0.0%
4.7K
total stars
#237
dbeaver/cloudbeaver

Cloud-based database manager UI for querying, managing, and visualizing databases across multiple platforms.

0
0.0%
4.7K
total stars
#238
ydb-platform/ydb

An open-source distributed SQL database with high availability, scalability, and ACID transactions.

0
0.0%
4.7K
total stars
#239
SPLWare/esProc

esProc SPL is a JVM-based programming language for structured data computation, serving as both a data analysis tool and an embedded computing engine.

0
0.0%
4.7K
total stars
#240
jitsucom/jitsu

Open-source data pipeline engine for real-time ETL, connecting data sources to warehouses like BigQuery, Snowflake, Redshift.

0
0.0%
4.7K
total stars
#241
BrambleXu/pydata-notebook

A collection of Jupyter Notebook files for data analysis using Python, including a Chinese translation of the popular 'Python for Data Analysis' book.

0
0.0%
4.7K
total stars
#242
nalgeon/redka

A Redis-compatible database implemented in Go, supporting SQL and multiple backends like PostgreSQL and SQLite.

0
0.0%
4.5K
total stars
#243
plotters-rs/plotters

A high-quality, cross-platform data plotting library for Rust developers, including WebAssembly support.

0
0.0%
4.5K
total stars
#244
hugo2046/QuantsPlaybook

A quantitative research and stock analysis platform for finance professionals.

0
0.0%
4.5K
total stars
#245
theOehrly/Fast-F1

A Python package for accessing and analyzing Formula 1 racing data, including results, schedules, timing, and telemetry.

0
0.0%
4.5K
total stars
#246
has2k1/plotnine

A grammar of graphics library for creating highly customizable and publication-quality plots in Python.

0
0.0%
4.5K
total stars
#247
deanmalmgren/textract

A Python library that provides a simple and unified interface for extracting text from any document format.

0
0.0%
4.5K
total stars
#248
dedupeio/dedupe

A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.

0
0.0%
4.4K
total stars
#249
CLUEbenchmark/CLUEDatasetSearch

A comprehensive search tool for finding Chinese NLP datasets, with support for common English NLP datasets as well.

0
0.0%
4.4K
total stars
#250
crate/crate

CrateDB is a distributed, scalable SQL database for storing and analyzing massive amounts of data in near real-time.

0
0.0%
4.4K
total stars
