Trending Projects

Discover the fastest-growing open-source projects

Showing 851-897 of 897 trending projects

#851
the-pudding/data

A repository of open-source data sets created for stories on The Pudding, a digital publication focused on data journalism.

0
0.0%
1.1K
total stars
#852
hail-is/hail

Cloud-native genomic dataframes and batch computing for bioinformatics and genetics research.

0
0.0%
1.1K
total stars
#853
moshi4/pyCirclize

A Python library for creating circular data visualizations like Circos plots, chord diagrams, and radar charts.

0
0.0%
1.1K
total stars
#854
apache/phoenix

Apache Phoenix is a scalable, distributed SQL engine that connects to HBase for low-latency queries.

0
0.0%
1.1K
total stars
#855
realm/realm-core

Core database component for the Realm Mobile Database SDKs, a popular NoSQL database for mobile apps.

0
0.0%
1.0K
total stars
#856
bigdatagenomics/adam

ADAM is a genomics analysis platform with specialized file formats built using Apache Spark and Apache Parquet.

0
0.0%
1.0K
total stars
#857
J535D165/recordlinkage

A powerful Python library for record linkage and duplicate detection in data-driven applications.

0
0.0%
1.0K
total stars
#858
josonle/Coding-Now

A collection of study notes, ebooks, and resources on big data, machine learning, Linux, and more for developers.

0
0.0%
1.0K
total stars
#859
rgeo/rgeo

A geospatial data library for Ruby that provides a set of tools for working with geographic data.

0
0.0%
1.0K
total stars
#860
hannorein/rebound

An open-source N-body simulation library for astrophysics and planetary science.

0
0.0%
1.0K
total stars
#861
cuge1995/awesome-time-series

A curated list of resources for time series forecasting, including papers, code, and other materials.

0
0.0%
1.0K
total stars
#862
google/cluster-data

A dataset of Borg cluster traces from Google, useful for researchers and developers working on distributed systems and cloud infrastructure.

0
0.0%
1.0K
total stars
#863
pixiedust/pixiedust

A Python helper library for enhancing Jupyter Notebooks with data visualization and analysis capabilities.

0
0.0%
1.0K
total stars
#864
apache/celeborn

Apache Celeborn is a high-performance shuffle and spilled data service for big data applications.

0
0.0%
1.0K
total stars
#865
avehtari/BDA_py_demos

Provides Bayesian data analysis demos in Python for developers interested in probabilistic modeling.

0
0.0%
1.0K
total stars
#866
facebookresearch/cc_net

Tools to download and clean up Common Crawl data, a large web crawl dataset, for further analysis and processing.

0
0.0%
1.0K
total stars
#867
LAStools/LAStools

This repository contains efficient tools for LiDAR processing, focused on working with point cloud data.

0
0.0%
1.0K
total stars
#868
taynaud/python-louvain

A Python library for implementing the Louvain community detection algorithm on graphs.

0
0.0%
1.0K
total stars
#869
TIBCOSoftware/snappydata

SnappyData is a memory-optimized analytics database based on Apache Spark and Apache Geode, enabling real-time stream processing, transactions, and predictive analytics.

0
0.0%
1.0K
total stars
#870
bashtage/linearmodels

This Python library provides additional linear models for statistical modeling and analysis.

0
0.0%
1.0K
total stars
#871
inloop/sqlite-viewer

A simple SQLite file viewer that allows you to view and explore SQLite databases online.

0
0.0%
1.0K
total stars
#872
Kotlin/dataframe

A Kotlin library for structured data processing, suitable for data analysis and data science tasks.

0
0.0%
1.0K
total stars
#873
devrimgunduz/pagila

A PostgreSQL sample database for testing and learning SQL queries.

0
0.0%
1.0K
total stars
#874
CJ-Chen/TBtools-II

A powerful GUI/CLI tool for biologists to work with NGS data, not a vibe coder tool.

0
0.0%
1.0K
total stars
#875
axiomhq/hyperloglog

HyperLogLog data structure library with space-efficient sparse and LogLog-Beta implementations.

0
0.0%
1.0K
total stars
#876
dataprofessor/code

A compilation of R and Python code for data science and machine learning projects.

0
0.0%
1.0K
total stars
#877
tidyverse/readr

A fast and flexible R package for reading flat files (CSV, TSV, fixed-width) into R data frames.

0
0.0%
1.0K
total stars
#878
IQSS/dataverse

Open source research data repository software built with Java.

0
0.0%
1.0K
total stars
#879
shaiwz/data-platform-open

A no-code, visual data integration platform for building big data pipelines and workflows.

0
0.0%
1.0K
total stars
#880
cyang-kth/fmm

An open-source C++ framework for fast and parallel map matching of GPS trajectories.

0
0.0%
1.0K
total stars
#881
opengeospatial/geoparquet

A specification for storing geospatial vector data (point, line, polygon) in the Parquet file format, enabling efficient cloud-native geospatial data processing.

0
0.0%
1.0K
total stars
#882
rstudio/pointblank

Data quality assessment and reporting tool for data frames and database tables in R.

0
0.0%
1.0K
total stars
#883
twosigma/flint

A time series library for Apache Spark that provides a high-level API for working with time series data.

0
0.0%
1.0K
total stars
#884
OHDSI/CommonDataModel

A definition and DDLs for the OMOP Common Data Model (CDM), a data model for healthcare data.

0
0.0%
1.0K
total stars
#885
scylladb/gocqlx

A comprehensive Go library for working with Cassandra/Scylla databases, providing a query builder, ORM, and migration tool.

0
0.0%
1.0K
total stars
#886
allenai/s2orc

A large-scale open-access corpus of scientific papers and metadata for researchers and developers.

0
0.0%
1.0K
total stars
#887
elliotchance/orderedmap

An ordered map implementation in Go with amortized O(1) performance for common operations.

0
0.0%
1.0K
total stars
#888
topling/toplingdb

ToplingDB is a cloud-native, distributed, and searchable key-value store built on RocksDB.

0
0.0%
1.0K
total stars
#889
lacuna/bifurcan

A library of functional, durable data structures written in Java for developers building robust applications.

0
0.0%
1.0K
total stars
#890
mysql/mysql-connector-j

MySQL Connector/J is a JDBC driver that enables Java applications to connect to MySQL databases.

0
0.0%
1.0K
total stars
#891
1eez/103976

A comprehensive English word database with translations, parts of speech, and definitions for developers.

0
0.0%
1.0K
total stars
#892
sentinelsat/sentinelsat

A Python library for searching and downloading Copernicus Sentinel satellite images for geographic data analysis.

0
0.0%
1.0K
total stars
#893
opengeos/streamlit-geospatial

A multi-page Streamlit app for geospatial data visualization and analysis, useful for housing and real estate applications.

0
0.0%
1.0K
total stars
#894
efficient/cuckoofilter

A space-efficient C++ implementation of the Cuckoo filter, a probabilistic data structure for set membership testing.

0
0.0%
1.0K
total stars
#895
blaze/odo

A Python library for data migration and transformation in the Blaze project.

0
0.0%
1.0K
total stars
#896
SciRuby/sciruby

SciRuby provides a collection of tools for scientific computation in Ruby, catering to developers working with data and scientific applications.

0
0.0%
1.0K
total stars
#897
shencangsheng/easydb_app

EasyDB is a lightweight desktop app that lets you query local CSV, Excel, and JSON files with SQL, without an external database.

0
0.0%
995
total stars