Trending Projects

Discover the fastest growing open source projects

Showing 401-450 of 897 trending projects

#401
DQinYuan/chinese_province_city_area_mapper

A Python module for extracting and mapping Chinese province, city, and district data.

+1
+0.1%
1.8K
total stars
#402
tidyverse/tidyverse

A collection of R packages for data science, including tools for data manipulation, visualization, and modeling.

+1
+0.1%
1.8K
total stars
#403
lh3/bwa

A fast and accurate short-read sequence aligner written in C for genomics applications.

+1
+0.1%
1.7K
total stars
#404
dbt-labs/dbt-utils

Utility functions for dbt projects, a popular data transformation tool for data engineers.

+1
+0.1%
1.7K
total stars
#405
Giorgi/EntityFramework.Exceptions

A .NET Standard library that provides strongly typed exceptions for Entity Framework Core across multiple database providers.

+1
+0.1%
1.7K
total stars
#406
orium/rpds

A Rust library that provides persistent data structures for efficient and immutable data management.

+1
+0.1%
1.7K
total stars
#407
ptyadana/SQL-Data-Analysis-and-Visualization-Projects

SQL data analysis and visualization projects using various tools and databases.

+1
+0.1%
1.7K
total stars
#408
awslabs/open-data-registry

A registry of publicly available datasets hosted on AWS for data-driven developers.

+1
+0.1%
1.7K
total stars
#409
koaning/drawdata

A Python library that allows developers to easily draw datasets within their notebooks.

+1
+0.1%
1.6K
total stars
#410
getdozer/dozer

Dozer is a real-time data movement tool that leverages CDC to move data between various sources and sinks.

+1
+0.1%
1.6K
total stars
#411
mourner/flatbush

A fast spatial index library for 2D points and rectangles in JavaScript, useful for geospatial applications.

+1
+0.1%
1.6K
total stars
#412
capitalone/DataProfiler

A Python library for extracting schema, statistics, and entities from datasets, useful for data profiling and privacy analysis.

+1
+0.1%
1.5K
total stars
#413
polarsignals/frostdb

A fast, embeddable columnar database written in Go, built for observability workloads.

+1
+0.1%
1.5K
total stars
#414
tonbo-io/tonbo

Tonbo is an embedded database for serverless and edge runtimes, optimized for offline-first and big data use cases.

+1
+0.1%
1.5K
total stars
#415
Awesome-Image-Registration-Organization/awesome-image-registration

A curated collection of resources related to image registration, including books, papers, videos, and toolboxes.

+1
+0.1%
1.5K
total stars
#416
san089/goodreads_etl_pipeline

An end-to-end data pipeline for building a data lake, data warehouse, and analytics platform from GoodReads data.

+1
+0.1%
1.5K
total stars
#417
XD-DENG/SQL-exercise

A collection of SQL practice problems for developers to improve their SQL skills.

+1
+0.1%
1.5K
total stars
#418
CodeCutTech/Efficient_Python_tricks_and_tools_for_data_scientists

A collection of efficient Python tricks and tools for data scientists to improve their productivity.

+1
+0.1%
1.5K
total stars
#419
substrait-io/substrait

A cross-language specification for expressing data transformations as relational algebra plans.

+1
+0.1%
1.5K
total stars
#420
duneanalytics/spellbook

A dbt project of community-maintained SQL models for Dune Analytics, a popular blockchain data analysis platform.

+1
+0.1%
1.5K
total stars
#421
percona/percona-toolkit

Percona Toolkit is a collection of advanced open source database tools for MySQL, MongoDB, and PostgreSQL.

+1
+0.1%
1.5K
total stars
#422
cloudera/hue

An open source SQL query assistant service for databases and data warehouses.

+1
+0.1%
1.4K
total stars
#423
igraph/python-igraph

Python interface for the igraph library, a powerful tool for network analysis and visualization.

+1
+0.1%
1.4K
total stars
#424
orbitinghail/graft

Graft is an open-source transactional storage engine optimized for lazy, partial, and strongly consistent replication, ideal for edge, offline-first, and distributed applications.

+1
+0.1%
1.4K
total stars
#425
distributedio/titan

A distributed, Redis-compatible NoSQL database that provides high performance and scalability.

+1
+0.1%
1.4K
total stars
#426
toluaina/pgsync

A Python library that syncs data from Postgres to Elasticsearch/OpenSearch, enabling real-time data pipelines.

+1
+0.1%
1.4K
total stars
#427
crazyhottommy/getting-started-with-genomics-tools-and-resources

A collection of Unix, R, and Python tools for bioinformatics and data science projects.

+1
+0.1%
1.4K
total stars
#428
PKUJohnson/OpenData

An open-source financial data extraction tool that allows easy API access to web scrape data from various websites.

+1
+0.1%
1.4K
total stars
#429
gtoonstra/etl-with-airflow

This repository provides best practices and examples for building ETL (Extract, Transform, Load) pipelines using Apache Airflow.

+1
+0.1%
1.4K
total stars
#430
Image-Py/imagepy

A Python-based image processing framework with plugins for common image processing libraries.

+1
+0.1%
1.4K
total stars
#431
wainshine/Company-Names-Corpus

A corpus of company names, abbreviations, and brands that can be used for Chinese text segmentation and entity recognition.

+1
+0.1%
1.3K
total stars
#432
elixir-explorer/explorer

A fast and elegant data exploration library for Elixir, providing series and dataframes for data science workflows.

+1
+0.1%
1.3K
total stars
#433
rsvp/fecon235

Notebooks for financial economics, including analyses of Federal Reserve, GDP, inflation, and more.

+1
+0.1%
1.3K
total stars
#434
s3ql/s3ql

A full-featured file system for online data storage, built with Python.

+1
+0.1%
1.2K
total stars
#435
andrewgbruce/statistics-for-data-scientists

This repository provides code and data for a book on statistics for data scientists.

+1
+0.1%
1.2K
total stars
#436
kelvins/municipios-brasileiros

A dataset of Brazilian municipalities, including IBGE codes, latitude, longitude, and more.

+1
+0.1%
1.2K
total stars
#437
TablePlus/DBngin

DBngin is a free tool for installing and managing local database servers for development.

+1
+0.1%
1.2K
total stars
#438
lakekeeper/lakekeeper

Lakekeeper is an open-source, secure, and fast Apache Iceberg REST Catalog written in Rust for data lakehouse governance.

+1
+0.1%
1.2K
total stars
#439
machow/siuba

A Python library for using dplyr-like syntax with pandas and SQL databases.

+1
+0.1%
1.2K
total stars
#440
opengeos/Awesome-GEE

A curated list of Google Earth Engine resources for geospatial analysis and remote sensing applications.

+1
+0.1%
1.2K
total stars
#441
YuLab-SMU/clusterProfiler

A comprehensive enrichment analysis tool for interpreting omics data, with support for GO, KEGG, and more.

+1
+0.1%
1.2K
total stars
#442
pytroll/satpy

A Python package for processing earth-observing satellite data with support for common data formats and tools.

+1
+0.1%
1.2K
total stars
#443
scikit-bio/scikit-bio

A versatile Python library for bioinformatics, providing data structures, algorithms, and educational resources.

+1
+0.1%
1.2K
total stars
#444
farzaa/gemini-bball

This is a Python library focused on basketball analytics and data processing.

+1
+0.1%
1.2K
total stars
#445
thinh-vu/vnstock

A beginner-friendly Python toolkit for financial data extraction, analysis, and automation.

+1
+0.1%
1.2K
total stars
#446
abhishek-ch/around-dataengineering

A comprehensive knowledge hub for data engineering, machine learning, and MLOps tools and practices.

+1
+0.1%
1.1K
total stars
#447
graphframes/graphframes

GraphFrames provides DataFrame-based Graphs for Apache Spark, enabling scalable graph analysis and algorithms.

+1
+0.1%
1.1K
total stars
#448
rordenlab/dcm2niix

A DICOM to NIfTI converter for medical imaging research and neuroimaging applications.

+1
+0.1%
1.1K
total stars
#449
jblindsay/whitebox-tools

An advanced geospatial data analysis platform for tasks like geomorphology, hydrology, and remote sensing.

+1
+0.1%
1.1K
total stars
#450
OvertureMaps/data

Documentation and access instructions for the open geographic data releases published by Overture Maps.

+1
+0.1%
1.1K
total stars
