Trending Projects

Discover the fastest growing open source projects

Showing 801-850 of 897 trending projects

#801

locationtech/geomesa

GeoMesa is a suite of tools for working with big geo-spatial data in a distributed fashion.

+21

+1.4%

1.5K

total stars

Scala

#802

ucarGroup/DataLink

DataLink is a real-time and offline data exchange platform that supports synchronization between heterogeneous data sources.

+21

+1.9%

1.1K

total stars

Java

#803

EliotAndres/kaggle-past-solutions

A searchable compilation of Kaggle past solutions for data science and machine learning developers.

+20

+1.4%

1.5K

total stars

HTML

#804

dask/dask-tutorial

An interactive tutorial for the Dask distributed computing library, focused on data analysis and manipulation.

+19

+1.0%

1.9K

total stars

Jupyter Notebook

#805

cswinter/LocustDB

A blazingly fast analytics database built with Rust, optimized for rapidly devouring large amounts of data.

+19

+1.2%

1.6K

total stars

Rust

#806

cgarciae/pypeln

Concurrent data pipelines in Python for building efficient and scalable data processing workflows.

+19

+1.2%

1.6K

total stars

Python

#807

tensorbase/tensorbase

TensorBase is a new big data warehousing solution built with Rust, focused on high-performance analytics.

+19

+1.3%

1.5K

total stars

Rust

#808

Softmotions/ejdb

EJDB2 is an embeddable JSON database engine with a simple XPath-like query language (JQL) for C/C++ applications.

+18

+1.2%

1.5K

total stars

#809

quiltdata/quilt

Quilt is a data mesh for connecting people with actionable data, built with TypeScript.

+18

+1.3%

1.4K

total stars

TypeScript

#810

mycelial/mycelite

Mycelite is a SQLite extension that enables replication between SQLite instances.

+18

+1.7%

1.1K

total stars

Rust

#811

attic-labs/noms

The versioned, forkable, syncable database for developers who need a scalable, distributed data solution.

+17

+0.2%

7.4K

total stars

#812

wesm/feather

Feather is a fast, interoperable binary data frame storage format for Python, R, and more, powered by Apache Arrow.

+16

+0.6%

2.8K

total stars

JavaScript

#813

BlankerL/DXY-COVID-19-Data

A data warehouse for COVID-19 time series data, useful for data analysis and visualization.

+16

+0.7%

2.2K

total stars

Python

#814

GiovineItalia/Gadfly.jl

Crafty statistical graphics library for the Julia programming language

+16

+0.8%

1.9K

total stars

Julia

#815

CamDavidsonPilon/lifetimes

A Python library for calculating customer lifetime value metrics and cohort analysis.

+16

+1.1%

1.5K

total stars

Python

#816

karlseguin/the-little-redis-book

A book that teaches the basics of Redis, the in-memory data structure store.

+16

+1.1%

1.5K

total stars

TeX

#817

scijs/ndarray

A JavaScript library for working with multidimensional arrays, useful for data visualization and scientific computing.

+16

+1.3%

1.2K

total stars

JavaScript

#818

datacrypt-project/hitchhiker-tree

A high-performance, persistent, off-heap data structure written in Clojure for data-intensive applications.

+16

+1.3%

1.2K

total stars

Clojure

#819

juliasilge/tidytext

A library for text mining and natural language processing using tidy data principles in R.

+16

+1.4%

1.2K

total stars

#820

brettkromkamp/contextualise

Contextualise is a powerful tool for organizing diverse information resources in knowledge-intensive projects.

+16

+1.5%

1.1K

total stars

Python

#821

filodb/FiloDB

A distributed, scalable Prometheus-compatible time series database written in Scala.

+15

+1.0%

1.5K

total stars

Scala

#822

attaswift/BTree

A fast, in-memory B-tree implementation for sorted collections in Swift.

+15

+1.1%

1.3K

total stars

Swift

#823

schematics/schematics

Python data structures library focused on serialization, deserialization, and validation of complex data schemas.

+14

+0.5%

2.6K

total stars

Python

#824

citusdata/cstore_fdw

A columnar storage extension for Postgres built as a foreign data wrapper.

+14

+0.8%

1.8K

total stars

#825

re-data/re-data

A data quality and observability tool for monitoring and fixing data issues before they become problems.

+14

+0.9%

1.6K

total stars

HTML

#826

pentaho/mondrian

Mondrian is an OLAP server that enables real-time analysis of large data sets for business users.

+14

+1.2%

1.2K

total stars

Java

#827

ricklamers/gridstudio

Grid Studio is a web-based application for data science with full integration of open source data science frameworks and languages.

+13

+0.1%

8.9K

total stars

JavaScript

#828

eveningkid/denodb

A versatile ORM for multiple databases including MySQL, SQLite, MariaDB, PostgreSQL, and MongoDB in Deno.

+13

+0.7%

1.9K

total stars

TypeScript

#829

Intel-bigdata/HiBench

HiBench is a big data benchmark suite for evaluating the performance of different big data frameworks.

+13

+0.9%

1.5K

total stars

Java

#830

slashbase/slashbaseide

Modern database IDE for dev & data workflows, supporting MySQL, PostgreSQL & MongoDB.

+13

+1.0%

1.3K

total stars

TypeScript

#831

prisma/prisma1

Prisma1 is a database toolkit with an ORM, migrations, and admin UI for Postgres, MySQL, and MongoDB.

+12

+0.1%

16.4K

total stars

Scala

#832

thinkaurelius/titan

Titan is a distributed graph database that can be used for building large-scale data-intensive applications.

+12

+0.2%

5.2K

total stars

Java

#833

variety/variety

A MongoDB schema analysis tool that helps developers understand and optimize their NoSQL database.

+12

+0.7%

1.8K

total stars

JavaScript

#834

cmu-db/noisepage

Self-Driving Database Management System from Carnegie Mellon University

+12

+0.7%

1.8K

total stars

C++

#835

cn/GB2260

A Python library for looking up China's administrative division codes as defined by the GB/T 2260 standard.

+12

+0.8%

1.5K

total stars

Python

#836

machow/siuba

Python library for using dplyr-like syntax with pandas and SQL databases

+12

+1.0%

1.2K

total stars

Python

#837

RxSwiftCommunity/RxRealm

A Swift extension for RealmSwift that provides reactive programming support using RxSwift.

+12

+1.0%

1.2K

total stars

Swift

#838

RJT1990/pyflux

Open source time series library for Python, useful for statistical analysis and modeling.

+11

+0.5%

2.1K

total stars

Python

#839

begeekmyfriend/bplustree

A fast B+ tree indexing structure in C for efficient storage and retrieval of billions of key-value pairs.

+11

+0.6%

1.9K

total stars

#840

moby/datakit

Connect processes into powerful data pipelines with a simple git-like filesystem interface

+11

+1.0%

1.1K

total stars

OCaml

#841

rhiever/datacleaner

A Python tool that automatically cleans and preprocesses data for analysis and machine learning.

+11

+1.0%

1.1K

total stars

Python

#842

jitsucom/jitsu

Open-source data pipeline engine for real-time ETL, connecting data sources to warehouses such as BigQuery, Snowflake, and Redshift.

+10

+0.2%

4.7K

total stars

TypeScript

#843

FeatureBaseDB/featurebase

FeatureBase is a fast analytical database built on bitmaps, perfect for ML and data-intensive applications.

+10

+0.4%

2.5K

total stars

#844

lukasmartinelli/pgfutter

A tool to easily import CSV and JSON data into PostgreSQL databases.

+10

+0.8%

1.3K

total stars

#845

YelpArchive/dataset-examples

Sample datasets for users of the Yelp Academic Dataset, useful for data analysis and machine learning.

+10

+0.8%

1.3K

total stars

Python

#846

scratchdata/scratchdata

A Swiss army knife for big data, enabling seamless integration with popular data warehousing solutions.

+10

+0.9%

1.1K

total stars

#847

mahmoudparsian/data-algorithms-book

This repository provides a comprehensive guide and implementations for data algorithms using MapReduce, Spark, Java, and Scala.

+10

+0.9%

1.1K

total stars

Java

#848

orbitjs/orbit

A composable data framework for building ambitious web applications using TypeScript.

+0.4%

2.3K

total stars

TypeScript

#849

shancarter/mr-data-converter

A JavaScript library that converts CSV and tab-delimited data to web-friendly formats like JSON and XML.

+0.5%

2.0K

total stars

JavaScript

#850

openacid/slim

A space-efficient trie data structure in Go with fast lookup performance.

+0.5%

1.9K

total stars
