Trending Projects

Discover the fastest growing open source projects

Showing 601-650 of 897 trending projects

#601
cvxgrp/cvxportfolio

A Python library for portfolio optimization and back-testing in finance.

+87
+8.0%
1.2K
total stars
#602
scikit-bio/scikit-bio

A versatile Python library for bioinformatics, providing data structures, algorithms, and educational resources.

+87
+8.1%
1.2K
total stars
#603
san089/goodreads_etl_pipeline

An end-to-end data pipeline for building a data lake, data warehouse, and analytics platform from GoodReads data.

+86
+6.1%
1.5K
total stars
#604
MaxHalford/prince

A Python library for performing multivariate exploratory data analysis, including techniques like PCA, CA, MCA, MFA, and FAMD.

+86
+6.3%
1.4K
total stars
#605
tdpetrou/Learn-Pandas

This GitHub repository provides tutorials on effectively using the Pandas library for data analysis.

+86
+8.3%
1.1K
total stars
#606
facebook/mysql-5.6

This is Facebook's branch of the Oracle MySQL database, including the MyRocks storage engine.

+85
+3.4%
2.6K
total stars
#607
crazyhottommy/getting-started-with-genomics-tools-and-resources

A collection of Unix, R, and Python tools for bioinformatics and data science projects.

+85
+6.6%
1.4K
total stars
#608
snowplow/snowplow

A powerful customer data pipeline for collecting, processing, and analyzing user events and behavior.

+84
+1.2%
7.0K
total stars
#609
huachaohuang/awesome-dbdev

A curated list of awesome materials and resources for database development.

+84
+5.6%
1.6K
total stars
#610
uwdata/arquero

A JavaScript library for efficient querying and transformation of array-backed data tables.

+84
+5.9%
1.5K
total stars
#611
tidwall/btree

A high-performance B-tree implementation for Go, useful for building database-like applications.

+84
+7.5%
1.2K
total stars
#612
schemacrawler/SchemaCrawler

SchemaCrawler is a free database schema discovery and comprehension tool that supports various database management systems.

+83
+4.9%
1.8K
total stars
#613
go-spatial/tegola

Tegola is an open-source Mapbox Vector Tile server written in Go, enabling efficient geospatial data visualization.

+83
+6.0%
1.5K
total stars
#614
Tessil/robin-map

A fast and efficient C++ hash map and hash set implementation using robin hood hashing.

+83
+6.1%
1.4K
total stars
#615
datalevin/datalevin

A simple, fast and versatile Datalog database written in Clojure for vibe coders.

+83
+6.4%
1.4K
total stars
#616
imageio/imageio

A Python library for reading and writing a wide range of image and video formats, including DICOM, animated GIFs, and webcam capture.

+82
+5.1%
1.7K
total stars
#617
eralchemy/eralchemy

A Python tool that generates Entity Relationship Diagrams (ERDs) from SQLAlchemy models.

+82
+6.2%
1.4K
total stars
#618
Cyb3rWard0g/HELK

An open-source threat hunting platform built on the ELK stack for security researchers and analysts.

+81
+2.1%
3.9K
total stars
#619
rordenlab/dcm2niix

A DICOM to NIfTI converter for medical imaging research and neuroimaging applications.

+80
+7.6%
1.1K
total stars
#620
roboyoshi/datacurator-filetree

A standard filetree template for data curation and organization, useful for developers interested in data management.

+79
+5.1%
1.6K
total stars
#621
pysal/pysal

PySAL is a Python Spatial Analysis Library meta-package for geographical data analysis and modeling.

+79
+5.7%
1.5K
total stars
#622
SPLWare/esProc

esProc SPL is a JVM-based programming language for structured data computation, serving as both a data analysis tool and an embedded computing engine.

+78
+1.7%
4.7K
total stars
#623
fonnesbeck/statistical-analysis-python-tutorial

A tutorial for performing statistical data analysis using Python, covering topics like regression, hypothesis testing, and more.

+78
+4.8%
1.7K
total stars
#624
GoogleCloudPlatform/bigquery-utils

Useful scripts, UDFs, views, and other utilities for migration and data warehouse operations in BigQuery.

+78
+6.5%
1.3K
total stars
#625
polarsignals/frostdb

A fast, embeddable column database written in Go, optimized for AI/ML workloads.

+77
+5.4%
1.5K
total stars
#626
tcgoetz/GarminDB

A Python library for downloading, parsing, and analyzing health data from Garmin, FitBit, and MS Health.

+76
+2.6%
2.9K
total stars
#627
spark-examples/pyspark-examples

A collection of PySpark examples covering RDD, DataFrame, and Dataset operations in Python.

+76
+6.0%
1.3K
total stars
#628
jrfiedler/causal_inference_python_code

Python code accompanying Causal Inference, the book by Miguel Hernán and James Robins.

+75
+5.9%
1.3K
total stars
#629
pachterlab/gget

gget is a Python library that enables efficient querying of genomic reference databases like NCBI, Ensembl, and UniProt.

+75
+7.3%
1.1K
total stars
#630
microsoft/azuredatastudio

Azure Data Studio is a data management and development tool with connectivity to popular cloud and on-premises databases.

+74
+1.0%
7.7K
total stars
#631
timescale/tsbs

A tool for comparing and evaluating databases for time series data.

+74
+5.4%
1.4K
total stars
#632
nalepae/pandarallel

A parallel processing library for Pandas that improves performance on multi-core CPUs.

+73
+1.9%
3.8K
total stars
#633
IRkernel/IRkernel

R kernel for the Jupyter notebook environment, enabling interactive R programming in Jupyter.

+73
+4.5%
1.7K
total stars
#634
SciTools/cartopy

Cartopy is a Python library for creating maps and visualizing spatial data with matplotlib support.

+73
+4.8%
1.6K
total stars
#635
IndrajeetPatil/ggstatsplot

ggstatsplot is an R library that enhances ggplot2 visualizations with statistical analysis and hypothesis testing.

+71
+3.4%
2.2K
total stars
#636
TablePlus/DBngin

DBngin is a free, open-source, cross-platform database management tool for developers.

+71
+6.2%
1.2K
total stars
#637
graphframes/graphframes

GraphFrames provides DataFrame-based Graphs for Apache Spark, enabling scalable graph analysis and algorithms.

+71
+6.7%
1.1K
total stars
#638
fugue-project/fugue

A unified interface for distributed computing on Spark, Dask and Ray without any rewrites.

+70
+3.4%
2.1K
total stars
#639
gobuffalo/pop

A Go ORM and query builder for interacting with databases in Go applications.

+70
+4.9%
1.5K
total stars
#640
xitongsys/parquet-go

A pure Go library for reading and writing Parquet files, a columnar data format.

+70
+5.2%
1.4K
total stars
#641
realm/realm-core

Core database component for the Realm Mobile Database SDKs, a popular NoSQL database for mobile apps.

+70
+7.2%
1.0K
total stars
#642
quarylabs/quary

Open-source BI platform for engineers to explore and model large-scale data pipelines.

+69
+3.0%
2.4K
total stars
#643
chaisql/chai

A modern, embedded SQL database written in Go for embedded and mobile applications.

+69
+4.3%
1.7K
total stars
#644
orlp/slotmap

A Rust data structure for efficiently storing and accessing data in a sparse set.

+69
+5.7%
1.3K
total stars
#645
upper/db

A data access layer (DAL) and ORM-like library for working with SQL and NoSQL databases in Go.

+68
+1.9%
3.6K
total stars
#646
paul-buerkner/brms

An R package for Bayesian generalized multivariate non-linear multilevel models using Stan.

+68
+5.1%
1.4K
total stars
#647
Data-Learn/data-engineering

A comprehensive resource for developers to learn and get started with data engineering using Python.

+68
+5.5%
1.3K
total stars
#648
neo4j-contrib/neo4j-apoc-procedures

A collection of procedures for the Neo4j graph database, providing advanced graph algorithms and utilities.

+67
+3.8%
1.9K
total stars
#649
Cyan4973/FiniteStateEntropy

A high-performance compression library written in C for developers working with large data sets.

+67
+4.8%
1.5K
total stars
#650
movingpandas/movingpandas

A Python library for analyzing movement trajectory data using GeoPandas.

+67
+5.1%
1.4K
total stars
