Trending Projects

Discover the fastest growing open source projects

Showing 601-650 of 897 trending projects

#601
GeostatsGuy/PythonNumericalDemos

Python demos for spatial data analytics, geostatistics, and machine learning to support courses.

+3
+0.2%
1.5K
total stars
#602
go-spatial/tegola

Tegola is an open-source Mapbox Vector Tile server written in Go, enabling efficient geospatial data visualization.

+3
+0.2%
1.5K
total stars
#603
wgzhao/Addax

A fast and versatile ETL tool that can transfer data between RDBMS and NoSQL databases seamlessly

+3
+0.2%
1.4K
total stars
#604
PKUJohnson/OpenData

An open-source financial data tool that provides simple API access to data scraped from various websites.

+3
+0.2%
1.4K
total stars
#605
Image-Py/imagepy

A Python-based image processing framework with plugins for common image processing libraries.

+3
+0.2%
1.4K
total stars
#606
attaswift/BTree

A fast, in-memory B-tree implementation for sorted collections in Swift.

+3
+0.2%
1.3K
total stars
#607
alan-turing-institute/CleverCSV

A Python package for handling messy CSV files with improved dialect detection and a command-line interface.

+3
+0.2%
1.3K
total stars
#608
ifsnop/mysqldump-php

A PHP library that provides MySQL backup functionality, similar to the mysqldump CLI tool.

+3
+0.2%
1.3K
total stars
#609
JetBrains/xodus

Xodus is a transactional, schema-less embedded database used by JetBrains products like YouTrack and Hub.

+3
+0.2%
1.3K
total stars
#610
datacrypt-project/hitchhiker-tree

A high-performance, persistent, off-heap data structure written in Clojure for data-intensive applications.

+3
+0.3%
1.2K
total stars
#611
TablePlus/DBngin

DBngin is a free, open-source, cross-platform database management tool for developers.

+3
+0.3%
1.2K
total stars
#612
bububa/MongoHub-Mac

MongoHub is a native macOS MongoDB client that provides a GUI for managing and interacting with MongoDB databases.

+3
+0.3%
1.2K
total stars
#613
apache/accumulo

Apache Accumulo is a scalable and robust key-value store that provides a sparse, sorted, distributed, and persistent multi-dimensional table.

+3
+0.3%
1.1K
total stars
#614
eduosi/district

This repository contains data on Chinese administrative divisions, including names, pinyin, and codes.

+3
+0.3%
1.1K
total stars
#615
docker-library/mongo

Docker image for the popular MongoDB database, enabling easy deployment and integration with other services.

+3
+0.3%
1.1K
total stars
#616
crazyhottommy/RNA-seq-analysis

This GitHub repository contains notes and code for analyzing RNA-seq data using Python and Snakemake.

+3
+0.3%
1.1K
total stars
#617
gunrock/gunrock

Programmable CUDA/C++ GPU Graph Analytics library for high-performance parallel graph processing.

+3
+0.3%
1.1K
total stars
#618
patx/pickledb

An in-memory key-value store for Python that uses the third-party orjson library for persistence, with SQLite support.

+3
+0.3%
1.1K
total stars
#619
apache/celeborn

Apache Celeborn is a high-performance shuffle and spilled data service for big data applications.

+3
+0.3%
1.0K
total stars
#620
CJ-Chen/TBtools-II

A powerful GUI/CLI tool for biologists to work with NGS data.

+3
+0.3%
1.0K
total stars
#621
rstudio/pointblank

Data quality assessment and reporting tool for data frames and database tables in R

+3
+0.3%
1.0K
total stars
#622
elliotchance/orderedmap

An ordered map implementation in Go with amortized O(1) performance for common operations.

+3
+0.3%
1.0K
total stars
#623
hazelcast/hazelcast

Hazelcast is a high-performance, distributed in-memory data platform for real-time insights and stream processing.

+2
+0.0%
6.6K
total stars
#624
apache/hive

Apache Hive is data warehouse software built on top of Apache Hadoop for querying and managing large datasets.

+2
+0.0%
6.0K
total stars
#625
niderhoff/nlp-datasets

A curated list of free/public domain text datasets for natural language processing (NLP) tasks.

+2
+0.0%
6.0K
total stars
#626
kakuilan/china_area_mysql

A MySQL database of China's five-level administrative regions.

+2
+0.0%
5.3K
total stars
#627
sripathikrishnan/redis-rdb-tools

A Python tool to parse Redis dump.rdb files, analyze memory usage, and export data to JSON.

+2
+0.0%
5.2K
total stars
#628
tidwall/buntdb

BuntDB is an embeddable, in-memory key/value database for Go with custom indexing and geospatial support.

+2
+0.0%
4.8K
total stars
#629
crate/crate

CrateDB is a distributed, scalable SQL database for storing and analyzing massive amounts of data in near real-time.

+2
+0.1%
4.4K
total stars
#630
ploomber/ploomber

Ploomber is a fast and versatile tool for building and deploying data pipelines that can be used with a variety of AI and ML tools.

+2
+0.1%
3.6K
total stars
#631
WeBankFinTech/DataSphereStudio

DataSphereStudio is a one-stop data application development and management portal covering data exchange, analysis, and visualization.

+2
+0.1%
3.3K
total stars
#632
wesm/feather

Feather is a fast, interoperable binary data frame storage format for Python, R, and more, powered by Apache Arrow.

+2
+0.1%
2.8K
total stars
#633
schematics/schematics

Python data structures library focused on serialization, deserialization, and validation of complex data schemas.

+2
+0.1%
2.6K
total stars
#634
youngyangyang04/Skiplist-CPP

A lightweight key-value store built with C++ using a skiplist data structure.

+2
+0.1%
2.4K
total stars
#635
chezou/tabula-py

A simple Python wrapper for the Tabula Java library, which extracts tables from PDF files into Pandas DataFrames.

+2
+0.1%
2.3K
total stars
#636
enhancedformysql/The-Art-of-Problem-Solving-in-Software-Engineering_How-to-Make-MySQL-Better

This repository provides a comprehensive guide on optimizing MySQL performance and solving common database problems.

+2
+0.1%
1.9K
total stars
#637
plant99/felicette

A Python library for processing and visualizing satellite imagery data.

+2
+0.1%
1.8K
total stars
#638
Kyubyong/numpy_exercises

A repository of NumPy exercises for developers looking to improve their Python and data manipulation skills.

+2
+0.1%
1.7K
total stars
#639
JifuZhao/DS-Take-Home

A collection of data science take-home challenges and solutions implemented in Jupyter Notebooks.

+2
+0.1%
1.7K
total stars
#640
dingodb/dingo

A high-performance, MySQL-compatible vector database that supports structured and unstructured data for AI-driven applications.

+2
+0.1%
1.7K
total stars
#641
aws-samples/aws-glue-samples

AWS Glue code samples for building data integration and ETL pipelines on AWS.

+2
+0.1%
1.5K
total stars
#642
locationtech/geomesa

GeoMesa is a suite of tools for working with big geo-spatial data in a distributed fashion.

+2
+0.1%
1.5K
total stars
#643
shuttle-hq/synth

Synth is a Rust library for generating realistic, randomized test data for applications and databases.

+2
+0.1%
1.5K
total stars
#644
CamDavidsonPilon/lifetimes

A Python library for calculating customer lifetime value metrics and cohort analysis.

+2
+0.1%
1.5K
total stars
#645
Tessil/robin-map

A fast and efficient C++ hash map and hash set implementation using robin hood hashing.

+2
+0.1%
1.4K
total stars
#646
sfirke/janitor

A collection of simple tools for data cleaning and wrangling in R for data science tasks.

+2
+0.1%
1.4K
total stars
#647
tidyverse/tidyr

tidyr is an R package that provides a set of functions to tidy messy data into a format suitable for analysis.

+2
+0.1%
1.4K
total stars
#648
r-spatial/sf

An R package that provides support for simple features, a standardized way to encode spatial vector data.

+2
+0.1%
1.4K
total stars
#649
PumpkinDB/PumpkinDB

PumpkinDB is an immutable, ordered key-value database engine written in Rust.

+2
+0.1%
1.4K
total stars
#650
PyTables/PyTables

A powerful Python package to manage and work with extremely large amounts of data.

+2
+0.1%
1.4K
total stars