Trending Projects

Discover the fastest growing open source projects

Showing 751-800 of 897 trending projects

#751
databricks/LearningSparkV2

Companion code for the book Learning Spark, 2nd Edition, which teaches how to use Apache Spark for lightning-fast data analytics.

+17
+1.3%
1.4K
total stars
#752
shaypal5/awesome-twitter-data

A curated list of Twitter datasets and resources for data scientists and social network analysts.

+17
+1.6%
1.1K
total stars
#753
cayleygraph/cayley

An open-source graph database written in Go, useful for building applications that require linked data and graph-based queries.

+16
+0.1%
15.0K
total stars
#754
psycopg/psycopg2

A Python database adapter for PostgreSQL, allowing developers to interact with their databases.

+16
+0.4%
3.6K
total stars
#755
MaxHalford/prince

A Python library for performing multivariate exploratory data analysis, including techniques like PCA, CA, MCA, MFA, and FAMD.

+16
+1.1%
1.4K
total stars
#756
gaarason/database-all

Eloquent ORM for Java 8, 11, 17, 21, and 23, and Spring Boot 2.x and 3.x.

+16
+1.5%
1.1K
total stars
#757
alibaba/druid

Druid is a high-performance database connection pool for Java applications, designed for monitoring and management.

+15
+0.1%
28.2K
total stars
#758
gonum/plot

A Go library for creating high-quality plots and visualizations of data.

+15
+0.5%
2.9K
total stars
#759
npgsql/efcore.pg

Entity Framework Core provider for PostgreSQL, enabling .NET developers to easily interact with PostgreSQL databases.

+15
+0.8%
1.8K
total stars
#760
ResidentMario/geoplot

A high-level geospatial data visualization library for Python developers working with spatial data.

+15
+1.3%
1.2K
total stars
#761
fraunhoferportugal/tsfel

An intuitive library to extract features from time series data for data science and machine learning.

+15
+1.4%
1.1K
total stars
#762
lukes/ISO-3166-Countries-with-Regional-Codes

A comprehensive dataset of ISO 3166-1 country codes and their corresponding UN Geoscheme regional codes, ready to use in various formats.

+14
+0.6%
2.4K
total stars
#763
JuliaPlots/Plots.jl

Powerful plotting and data visualization library for the Julia programming language.

+14
+0.7%
1.9K
total stars
#764
xflr6/graphviz

Simple Python interface for Graphviz, a popular open-source graph visualization tool.

+14
+0.8%
1.8K
total stars
#765
mourner/flatbush

A fast spatial index library for 2D points and rectangles in JavaScript, useful for geospatial applications.

+14
+0.9%
1.6K
total stars
#766
percona/percona-xtrabackup

Open-source hot backup tool for InnoDB and XtraDB databases.

+14
+0.9%
1.5K
total stars
#767
Awesome-Image-Registration-Organization/awesome-image-registration

A curated collection of resources related to image registration, including books, papers, videos, and toolboxes.

+14
+0.9%
1.5K
total stars
#768
NiuTrans/Classical-Modern

A parallel corpus of classical Chinese and modern Chinese texts for language processing and analysis.

+14
+1.0%
1.4K
total stars
#769
toluaina/pgsync

A Python library that syncs data from Postgres to Elasticsearch/OpenSearch, enabling real-time data pipelines.

+14
+1.0%
1.4K
total stars
#770
quarylabs/quary

Open-source BI platform for engineers to explore and model large-scale data pipelines.

+13
+0.6%
2.4K
total stars
#771
s3ql/s3ql

A full-featured file system for online data storage, built with Python.

+13
+1.1%
1.2K
total stars
#772
ddsjoberg/gtsummary

An R package that provides customizable and presentation-ready data summary and analytic result tables.

+13
+1.1%
1.2K
total stars
#773
graphframes/graphframes

GraphFrames provides DataFrame-based Graphs for Apache Spark, enabling scalable graph analysis and algorithms.

+13
+1.2%
1.1K
total stars
#774
axiomhq/hyperloglog

HyperLogLog data structure library with space-efficient sparse and LogLog-Beta implementations.

+13
+1.3%
1.0K
total stars
#775
dpilger26/NumCpp

A C++ implementation of the Python NumPy library for scientific computing and numerical analysis.

+12
+0.3%
3.9K
total stars
#776
The-Japan-DataScientist-Society/100knocks-preprocess

A repository for the 100 Knocks of Data Science Preprocessing, focused on structured data processing.

+12
+0.5%
2.5K
total stars
#777
go-spatial/tegola

Tegola is an open-source Mapbox Vector Tile server written in Go, enabling efficient geospatial data visualization.

+12
+0.8%
1.5K
total stars
#778
r-spatial/sf

An R package that provides support for simple features, a standardized way to encode spatial vector data.

+12
+0.8%
1.4K
total stars
#779
wgzhao/Addax

A fast and versatile ETL tool that can transfer data between RDBMS and NoSQL databases seamlessly.

+12
+0.9%
1.4K
total stars
#780
avinassh/py-caskdb

An educational project to build a disk-based key-value store in Python for learning purposes.

+12
+0.9%
1.4K
total stars
#781
spark-examples/pyspark-examples

A collection of PySpark examples covering RDD, DataFrame, and Dataset operations in Python.

+12
+0.9%
1.3K
total stars
#782
pydata/bottleneck

A fast, efficient C extension for NumPy that provides optimized array functions.

+12
+1.0%
1.2K
total stars
#783
aarondl/sqlboiler

SQLBoiler is a Go ORM that generates code tailored to your database schema, making it easy to interact with databases.

+11
+0.2%
7.0K
total stars
#784
sfikas/medical-imaging-datasets

A collection of medical imaging datasets for researchers and developers in the healthcare industry.

+11
+0.4%
2.5K
total stars
#785
dotnet/EntityFramework.Docs

Documentation for the popular .NET ORM Entity Framework Core and Entity Framework 6.

+11
+0.6%
1.7K
total stars
#786
JifuZhao/DS-Take-Home

A collection of data science take-home challenges and solutions implemented in Jupyter Notebooks.

+11
+0.7%
1.7K
total stars
#787
obspy/obspy

A Python toolbox for seismology and seismological observatories, providing tools for data processing and analysis.

+11
+0.9%
1.3K
total stars
#788
jitsucom/jitsu

Open-source data pipeline engine for real-time ETL, connecting data sources to warehouses like BigQuery, Snowflake, Redshift.

+10
+0.2%
4.7K
total stars
#789
ankane/groupdate

A Ruby library that makes it easy to group temporal data, useful for developers working with time-series data.

+10
+0.3%
3.9K
total stars
#790
geekyouth/SZT-bigdata

This is a big data analysis system for the Shenzhen metro with support for various data processing tools.

+10
+0.4%
2.4K
total stars
#791
DQinYuan/chinese_province_city_area_mapper

A Python module for extracting and mapping Chinese province, city, and district data.

+10
+0.6%
1.8K
total stars
#792
imageio/imageio

A Python library for reading and writing a wide range of image and video formats, including DICOM, animated GIFs, and webcam capture.

+10
+0.6%
1.7K
total stars
#793
shuttle-hq/synth

Synth is a Rust library for generating realistic, randomized test data for applications and databases.

+10
+0.7%
1.5K
total stars
#794
Data-Learn/data-engineering

A comprehensive resource for developers to learn and get started with data engineering using Python.

+10
+0.8%
1.3K
total stars
#795
LuxCoreRender/LuxCore

LuxCore is a high-performance path-tracing render engine for realistic 3D graphics and visualization.

+10
+0.8%
1.3K
total stars
#796
tdpetrou/Learn-Pandas

This GitHub repository provides tutorials on effectively using the Pandas library for data analysis.

+10
+0.9%
1.1K
total stars
#797
dblalock/bolt

A fast C++ library for high-performance matrix and vector operations.

+9
+0.4%
2.5K
total stars
#798
mirage/irmin

Irmin is a distributed database that follows the same design principles as Git, allowing for distributed version control of data.

+9
+0.5%
1.9K
total stars
#799
uwdata/arquero

A JavaScript library for efficient querying and transformation of array-backed data tables.

+9
+0.6%
1.5K
total stars
#800
percona/percona-server

Percona Server is an enhanced, open-source version of the MySQL database management system.

+9
+0.7%
1.3K
total stars