Trending Projects

Discover the fastest-growing open source projects

Showing 751-800 of 897 trending projects

#751
apache/cloudberry

Open-source massively parallel processing (MPP) database, an alternative to Greenplum.

0
0.0%
1.2K
total stars
#752
JuliaStats/Distributions.jl

A comprehensive Julia library for probability distributions and related statistical functions.

0
0.0%
1.2K
total stars
#753
DaveSkender/Stock.Indicators

A C# NuGet package that provides technical indicators and trading insights for financial market data analysis.

0
0.0%
1.2K
total stars
#754
apachecn/spark-doc-zh

A Chinese translation of the official documentation for Apache Spark, a popular big data processing framework.

0
0.0%
1.2K
total stars
#755
bububa/MongoHub-Mac

MongoHub is a native macOS MongoDB client that provides a GUI for managing and interacting with MongoDB databases.

0
0.0%
1.2K
total stars
#756
machow/siuba

Python library for using dplyr-like syntax with pandas and SQL databases

0
0.0%
1.2K
total stars
#757
eventql/eventql

Distributed, massively parallel SQL query engine for big data analytics and timeseries workloads.

0
0.0%
1.2K
total stars
#758
PoloDB/PoloDB

PoloDB is an embedded document database written in Rust for building cross-platform, local-first applications.

0
0.0%
1.2K
total stars
#759
ddsjoberg/gtsummary

An R package that provides customizable and presentation-ready data summary and analytic result tables.

0
0.0%
1.2K
total stars
#760
apache/ozone

Scalable, reliable, distributed storage system optimized for data analytics and object store workloads.

0
0.0%
1.2K
total stars
#761
8080labs/ppscore

A Python library that provides a Predictive Power Score (PPS) to measure the predictive power between variables.

0
0.0%
1.2K
total stars
#762
xiaoxu193/PyTeaser

A Python library that summarizes news articles by extracting the most important sentences.

0
0.0%
1.2K
total stars
#763
pydata/bottleneck

A fast, efficient C extension for NumPy that provides optimized array functions.

0
0.0%
1.2K
total stars
#764
RxSwiftCommunity/RxRealm

A Swift extension for RealmSwift that provides reactive programming support using RxSwift.

0
0.0%
1.2K
total stars
#765
spatie/db-dumper

A PHP library for dumping the contents of a database to a file, supporting multiple database engines.

0
0.0%
1.2K
total stars
#766
apache/incubator-xtable

Apache XTable is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

0
0.0%
1.2K
total stars
#767
robjhyndman/forecast

A time series forecasting library for R, providing a wide range of models and tools for accurate predictions.

0
0.0%
1.2K
total stars
#768
datasets/covid-19

This GitHub repository provides time series data on COVID-19 cases, useful for data analysis and visualization.

0
0.0%
1.2K
total stars
#769
golang/leveldb

The LevelDB key-value database in the Go programming language.

0
0.0%
1.2K
total stars
#770
MarcosMeli/FileHelpers

A free and easy-to-use .NET library for reading and writing CSV and fixed-length data files.

0
0.0%
1.2K
total stars
#771
easystats/easystats

An R project focused on providing high-performance statistical models, data analysis, and visualization tools.

0
0.0%
1.1K
total stars
#772
GeospatialPython/pyshp

A pure Python library for reading and writing ESRI Shapefiles, a popular geospatial data format.

0
0.0%
1.1K
total stars
#773
samayo/country-json

A simple JSON data set of country information, useful for building apps that need country data.

0
0.0%
1.1K
total stars
#774
petewarden/dstk

A collection of open data sets and tools for data science and machine learning tasks.

0
0.0%
1.1K
total stars
#775
apache/accumulo

Apache Accumulo is a scalable and robust key-value store that provides a sparse, sorted, distributed, and persistent multi-dimensional table.

0
0.0%
1.1K
total stars
#776
lvgalvao/data-engineering-roadmap

Comprehensive roadmap for data engineering and AI development in Python

0
0.0%
1.1K
total stars
#777
ucarGroup/DataLink

DataLink is a real-time and offline data exchange platform that supports synchronization between heterogeneous data sources.

0
0.0%
1.1K
total stars
#778
alecthw/mmdb_china_ip_list

A library for generating MaxMind GeoIP2 databases for China IP addresses.

0
0.0%
1.1K
total stars
#779
tdpetrou/Learn-Pandas

This GitHub repository provides tutorials on effectively using the Pandas library for data analysis.

0
0.0%
1.1K
total stars
#780
scratchdata/scratchdata

A Swiss army knife for big data, enabling seamless integration with popular data warehousing solutions.

0
0.0%
1.1K
total stars
#781
apache/amoro

Apache Amoro is an open-source Lakehouse management system built on open data lake formats such as Iceberg and Hudi.

0
0.0%
1.1K
total stars
#782
youngwookim/awesome-hadoop

A curated list of resources for the Hadoop ecosystem.

0
0.0%
1.1K
total stars
#783
Teradata/kylo

Kylo is an enterprise-grade data lake management platform built on big data technologies like Spark and Hadoop.

0
0.0%
1.1K
total stars
#784
qri-io/qri

An open-source platform for building and sharing datasets, focused on trust, privacy, and decentralization.

0
0.0%
1.1K
total stars
#785
red-data-tools/pycall.rb

A library for calling Python functions from the Ruby language, enabling data science and ML workflows.

0
0.0%
1.1K
total stars
#786
moby/datakit

Connect processes into powerful data pipelines with a simple git-like filesystem interface

0
0.0%
1.1K
total stars
#787
OvertureMaps/data

Overture Maps Data is a Python library providing access to open-source geographic data.

0
0.0%
1.1K
total stars
#788
shaypal5/awesome-twitter-data

A curated list of Twitter datasets and resources for data scientists and social network analysts.

0
0.0%
1.1K
total stars
#789
mycelial/mycelite

Mycelite is a SQLite extension that enables replication between SQLite instances.

0
0.0%
1.1K
total stars
#790
paulmach/orb

A Go library with types and utilities for working with 2D geometry, geospatial data, and mapping.

0
0.0%
1.1K
total stars
#791
brettkromkamp/contextualise

Contextualise is a powerful tool for organizing diverse information resources in knowledge-intensive projects.

0
0.0%
1.1K
total stars
#792
tangwz/db-monthly

A collection of monthly reports on the internals of Alibaba Cloud's database products.

0
0.0%
1.1K
total stars
#793
caserec/Datasets-for-Recommender-Systems

A repository of datasets for building recommender systems.

0
0.0%
1.1K
total stars
#794
dataquestio/project-walkthroughs

A collection of data science, machine learning, and web development project code for Dataquest's YouTube channel.

0
0.0%
1.1K
total stars
#795
traildb/traildb

TrailDB is an efficient database for storing and querying series of events.

0
0.0%
1.1K
total stars
#796
fraunhoferportugal/tsfel

An intuitive library to extract features from time series data for data science and machine learning.

0
0.0%
1.1K
total stars
#797
mahmoudparsian/data-algorithms-book

This repository provides a comprehensive guide and implementations for data algorithms using MapReduce, Spark, Java, and Scala.

0
0.0%
1.1K
total stars
#798
liucongg/NLPDataSet

A repository containing various NLP datasets collected and organized by the owner.

0
0.0%
1.1K
total stars
#799
mpmath/mpmath

A Python library for arbitrary-precision floating-point arithmetic, providing advanced numerical capabilities.

0
0.0%
1.1K
total stars
#800
joaoh82/rust_sqlite

A simple embedded database library in Rust modeled after SQLite, useful for Rust projects.

0
0.0%
1.1K
total stars