Trending Projects

Discover the fastest growing open source projects

Showing 251-300 of 897 trending projects

#251
cgarciae/pypeln

Concurrent data pipelines in Python for building efficient and scalable data processing workflows.

+144
+9.9%
1.6K
total stars
#252
dbgate/dbgate

Database manager for multiple database engines, runs as desktop or web app.

+143
+2.1%
6.8K
total stars
#253
soedinglab/MMseqs2

MMseqs2 is an ultra-fast and sensitive bioinformatics tool for sequence search and clustering.

+143
+7.7%
2.0K
total stars
#254
apache/hugegraph

A highly scalable, high-performance graph database that supports over 100 billion data points.

+142
+5.0%
3.0K
total stars
#255
taosdata/TDengine

High-performance time-series database for IoT and IIoT

+141
+0.6%
24.8K
total stars
#256
Softmotions/ejdb

EJDB2 is an embeddable JSON database engine with a simple XPath-like query language (JQL) for C/C++ applications.

+141
+10.6%
1.5K
total stars
#257
veb-101/Data-Science-Projects

A collection of data science projects in Python using Jupyter Notebook.

+140
+5.8%
2.6K
total stars
#258
yougov/mongo-connector

MongoDB data stream pipeline tools for managing real-time data synchronization and replication.

+140
+8.1%
1.9K
total stars
#259
karlseguin/the-little-redis-book

A book that teaches the basics of Redis, the in-memory data structure store.

+140
+10.6%
1.5K
total stars
#260
compose/transporter

Transporter is a powerful ETL tool that allows developers to sync data between various persistence engines.

+139
+10.6%
1.4K
total stars
#261
devrimgunduz/pagila

A PostgreSQL sample database for testing and learning SQL queries.

+139
+15.6%
1.0K
total stars
#262
enthought/mayavi

A powerful 3D visualization library for scientific data in Python.

+138
+11.0%
1.4K
total stars
#263
markwk/qs_ledger

A personal data aggregator and analysis tool for self-tracking and quantified self enthusiasts.

+138
+15.0%
1.1K
total stars
#264
tidwall/buntdb

BuntDB is an embeddable, in-memory key/value database for Go with custom indexing and geospatial support.

+137
+2.9%
4.8K
total stars
#265
fortunejs/fortune

Non-native graph database abstraction layer for Node.js and web browsers.

+137
+10.4%
1.5K
total stars
#266
oxnr/awesome-bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

+136
+1.0%
14.3K
total stars
#267
oceanbase/oceanbase

A fast, scalable, and distributed database for transactional, analytical, and AI workloads.

+136
+1.4%
10.0K
total stars
#268
dathere/qsv

Blazing-fast data wrangling toolkit for AI and data engineering workflows

+136
+4.0%
3.5K
total stars
#269
dingodb/dingo

A high-performance, MySQL-compatible vector database that supports structured and unstructured data for AI-driven applications.

+136
+8.7%
1.7K
total stars
#270
jldbc/pybaseball

A Python library for pulling current and historical baseball statistics, including Statcast, Baseball Reference, and FanGraphs data.

+135
+9.2%
1.6K
total stars
#271
QueryKit/QueryKit

QueryKit is a simple CoreData query language for Swift and Objective-C developers.

+135
+10.2%
1.5K
total stars
#272
bukosabino/ta

Technical Analysis Library using Pandas and Numpy for financial data analysis and trading strategies.

+134
+2.8%
4.9K
total stars
#273
pudo/dataset

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.

+132
+2.8%
4.9K
total stars
#274
yhat/pandasql

pandasql is a Python library that allows developers to use SQL syntax to query Pandas DataFrames.

+132
+10.8%
1.3K
total stars
#275
eleanorlutz/asteroids_atlas_of_space

An astronomy visualization project that maps the orbits of asteroids in the solar system.

+132
+11.4%
1.3K
total stars
#276
rqlite/rqlite

A lightweight, fault-tolerant distributed database built on SQLite, designed for high availability.

+131
+0.8%
17.3K
total stars
#277
timescale/tsbs

A tool for comparing and evaluating databases for time series data.

+131
+10.0%
1.4K
total stars
#278
taynaud/python-louvain

A Python library that implements the Louvain community detection algorithm for graphs.

+131
+14.5%
1.0K
total stars
#279
tikv/tikv

Distributed transactional key-value database, originally created to complement TiDB

+129
+0.8%
16.6K
total stars
#280
eBay/akutan

A distributed knowledge graph store built in Go for managing large-scale semantic data.

+129
+8.4%
1.7K
total stars
#281
CodeCutTech/Efficient_Python_tricks_and_tools_for_data_scientists

A collection of efficient Python tricks and tools for data scientists to improve their productivity.

+128
+9.5%
1.5K
total stars
#282
lukasmartinelli/pgfutter

A tool to easily import CSV and JSON data into PostgreSQL databases.

+128
+10.5%
1.3K
total stars
#283
IQSS/dataverse

Open source research data repository software built with Java.

+128
+14.3%
1.0K
total stars
#284
BrambleXu/pydata-notebook

A collection of Jupyter Notebook files for data analysis using Python, including a Chinese translation of the popular 'Python for Data Analysis' book.

+127
+2.8%
4.7K
total stars
#285
mathesar-foundation/mathesar

An open-source, self-hosted database management tool with a spreadsheet-like interface for Postgres

+126
+2.7%
4.9K
total stars
#286
TomAugspurger/effective-pandas

A collection of articles and source code on using the pandas data analysis library.

+126
+8.7%
1.6K
total stars
#287
bruin-data/bruin

A data platform that enables building data pipelines with SQL, Python, and ingesting from various sources.

+126
+9.6%
1.4K
total stars
#288
wannesm/dtaidistance

A fast C-based implementation of Dynamic Time Warping, a popular algorithm for comparing time series data.

+126
+11.6%
1.2K
total stars
#289
duckdb/ducklake

DuckLake is an integrated data lake and catalog format written in C++.

+125
+5.2%
2.5K
total stars
#290
re-data/re-data

A data quality and observability tool for monitoring and fixing data issues before they become problems.

+125
+8.7%
1.6K
total stars
#291
mining/mining

A Python library for building business intelligence (BI) and OLAP solutions.

+125
+10.8%
1.3K
total stars
#292
bytewax/bytewax

Bytewax is a Python library for building scalable, fault-tolerant, and low-latency data processing pipelines.

+124
+6.8%
2.0K
total stars
#293
Cyan4973/FiniteStateEntropy

A high-performance entropy coding library written in C, providing the Finite State Entropy and Huff0 codecs for data compression.

+124
+9.2%
1.5K
total stars
#294
YelpArchive/dataset-examples

Sample datasets for users of the Yelp Academic Dataset, useful for data analysis and machine learning.

+124
+10.9%
1.3K
total stars
#295
duckdb/dbt-duckdb

A dbt adapter for the DuckDB database, enabling developers to build data pipelines and models with dbt.

+124
+11.1%
1.2K
total stars
#296
jblindsay/whitebox-tools

An advanced geospatial data analysis platform for tasks like geomorphology, hydrology, and remote sensing.

+124
+12.5%
1.1K
total stars
#297
PostgresApp/PostgresApp

An open-source macOS application that packages a full PostgreSQL server, providing an easy way to set up and run a local PostgreSQL database.

+123
+1.6%
7.7K
total stars
#298
juliasilge/tidytext

A library for text mining and natural language processing using tidy data principles in R.

+121
+11.2%
1.2K
total stars
#299
bububa/MongoHub-Mac

MongoHub is a native macOS MongoDB client that provides a GUI for managing and interacting with MongoDB databases.

+121
+11.4%
1.2K
total stars
#300
apache/seatunnel

A high-performance, distributed data integration tool for batch, streaming, and CDC use cases.

+120
+1.3%
9.1K
total stars