Trending Projects

Discover the fastest-growing open source projects

Showing 701-750 of 897 trending projects

#701

mahmoudparsian/pyspark-tutorial

PySpark-Tutorial provides basic algorithms using PySpark for big data analytics and data processing.

+0.1%

1.3K

total stars

Jupyter Notebook

#702

apache/impala

Apache Impala is a high-performance, open-source, SQL query engine that runs on Apache Hadoop and Apache Kudu.

+0.1%

1.3K

total stars

C++

#703

percona/percona-server

Percona Server is an enhanced, open-source version of the MySQL database management system.

+0.1%

1.3K

total stars

C++

#704

wannesm/dtaidistance

A fast C-based implementation of Dynamic Time Warping, a popular algorithm for comparing time series data.

+0.1%

1.2K

total stars

Python

#705

juliasilge/tidytext

A library for text mining and natural language processing using tidy data principles in R.

+0.1%

1.2K

total stars

#706

sryza/spark-timeseries

A library for time series analysis on Apache Spark, enabling efficient large-scale time series processing.

+0.1%

1.2K

total stars

Scala

#707

8080labs/ppscore

A Python library that provides a Predictive Power Score (PPS) to measure the predictive power between variables.

+0.1%

1.2K

total stars

Python

#708

RxSwiftCommunity/RxRealm

A Swift extension for RealmSwift that provides reactive programming support using RxSwift.

+0.1%

1.2K

total stars

Swift

#709

golang/leveldb

The LevelDB key-value database in the Go programming language.

+0.1%

1.2K

total stars

#710

MarcosMeli/FileHelpers

A free and easy-to-use .NET library for reading and writing CSV and fixed-length data files.

+0.1%

1.2K

total stars

#711

GeospatialPython/pyshp

A pure Python library for reading and writing ESRI Shapefiles, a popular geospatial data format.

+0.1%

1.1K

total stars

Python

#712

samayo/country-json

A simple JSON data set of country information, useful for building apps that need country data.

+0.1%

1.1K

total stars

JavaScript

#713

brettkromkamp/contextualise

Contextualise is a powerful tool for organizing diverse information resources in knowledge-intensive projects.

+0.1%

1.1K

total stars

Python

#714

big-data-europe/docker-hive

This is a Docker container for running Apache Hive, a data warehousing tool for big data analysis.

+0.1%

1.1K

total stars

Shell

#715

rhiever/datacleaner

A Python tool that automatically cleans and preprocesses data for analysis and machine learning.

+0.1%

1.1K

total stars

Python

#716

marcboeker/go-duckdb

A Go database/sql driver for the DuckDB database engine, enabling fast and efficient data processing.

+0.1%

1.1K

total stars

#717

jorgecarleitao/arrow2

A transmute-free Rust library for working with the Arrow data format.

+0.1%

1.1K

total stars

Rust

#718

RedisTimeSeries/RedisTimeSeries

A Redis module that provides a time series data structure for storing and querying time series data.

+0.1%

1.1K

total stars

#719

paulyoder/LinqToExcel

A library that allows developers to use LINQ to retrieve data from spreadsheets and CSV files.

+0.1%

1.1K

total stars

#720

KeithGalli/pandas

Tutorial materials for getting started with pandas, the Python library for data manipulation and analysis.

+0.1%

1.1K

total stars

Jupyter Notebook

#721

databricks/spark-csv

CSV Data Source for Apache Spark 1.x, a Scala library for working with structured data.

+0.1%

1.1K

total stars

Scala

#722

apache/phoenix

Apache Phoenix is a scalable, distributed SQL engine that connects to HBase for low-latency queries.

+0.1%

1.1K

total stars

Java

#723

J535D165/recordlinkage

A powerful Python library for record linkage and duplicate detection in data-driven applications.

+0.1%

1.0K

total stars

Python

#724

cuge1995/awesome-time-series

A curated list of resources for time series forecasting, including papers, code, and other materials.

+0.1%

1.0K

total stars

#725

avehtari/BDA_py_demos

Provides Bayesian data analysis demos in Python for developers interested in probabilistic modeling.

+0.1%

1.0K

total stars

Jupyter Notebook

#726

axiomhq/hyperloglog

HyperLogLog data structure library with space-efficient sparse and LogLog-Beta implementations.

+0.1%

1.0K

total stars

#727

topling/toplingdb

ToplingDB is a cloud-native, distributed, and searchable key-value store built on RocksDB.

+0.1%

1.0K

total stars

C++

#728

blaze/odo

A Python library for data migration and transformation in the Blaze project.

+0.1%

1.0K

total stars

Python

#729

apache/zeppelin

Zeppelin is a web-based notebook that enables data-driven, interactive data analytics and collaborative documents.

0.0%

6.6K

total stars

Java

#730

BrambleXu/pydata-notebook

A collection of Jupyter Notebook files for data analysis using Python, including a Chinese translation of the popular 'Python for Data Analysis' book.

0.0%

4.7K

total stars

Jupyter Notebook

#731

isar/isar

Extremely fast, easy to use, and fully async NoSQL database for Flutter apps.

0.0%

4.0K

total stars

Dart

#732

spark-notebook/spark-notebook

An interactive and reactive data science platform powered by Scala and Apache Spark.

0.0%

3.2K

total stars

JavaScript

#733

man-group/arctic

A high-performance datastore for time series and tick data built on top of MongoDB.

0.0%

3.1K

total stars

Python

#734

quantopian/qgrid

An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks.

0.0%

3.1K

total stars

Python

#735

EntilZha/PyFunctional

A Python library for creating data processing pipelines using functional programming principles.

0.0%

2.5K

total stars

Python

#736

PizzaDeDados/datascience-pizza

A repository for collecting study materials and resources related to data analysis and related fields.

0.0%

2.4K

total stars

#737

orbitjs/orbit

A composable data framework for building ambitious web applications using TypeScript.

0.0%

2.3K

total stars

TypeScript

#738

BlankerL/DXY-COVID-19-Data

A data warehouse for COVID-19 time series data, useful for data analysis and visualization.

0.0%

2.2K

total stars

Python

#739

chris1610/pbpython

A collection of Python code, notebooks, and examples for practical business data analysis and visualization.

0.0%

2.0K

total stars

Jupyter Notebook

#740

enhancedformysql/The-Art-of-Problem-Solving-in-Software-Engineering_How-to-Make-MySQL-Better

This repository provides a comprehensive guide on optimizing MySQL performance and solving common database problems.

0.0%

1.9K

total stars

#741

yougov/mongo-connector

MongoDB data stream pipeline tools for managing real-time data synchronization and replication.

0.0%

1.9K

total stars

Python

#742

neil3d/excel2json

A C# library that converts Excel spreadsheets to JSON objects and saves them to a text file.

0.0%

1.9K

total stars

#743

dask/dask-tutorial

An interactive tutorial for the Dask distributed computing library, focused on data analysis and manipulation.

0.0%

1.9K

total stars

Jupyter Notebook

#744

citusdata/cstore_fdw

A columnar storage extension for Postgres built as a foreign data wrapper.

0.0%

1.8K

total stars

#745

tidyverse/tidyverse

A collection of R packages for data science, including tools for data manipulation, visualization, and modeling.

0.0%

1.8K

total stars

#746

variety/variety

A MongoDB schema analysis tool that helps developers understand and optimize their NoSQL database.

0.0%

1.8K

total stars

JavaScript

#747

crossfilter/crossfilter

Fast n-dimensional filtering and grouping of records, a powerful data manipulation library for JavaScript.

0.0%

1.8K

total stars

JavaScript

#748

cmu-db/noisepage

Self-driving database management system from Carnegie Mellon University.

0.0%

1.8K

total stars

C++

#749

Tencent/paxosstore

PaxosStore is a high-performance, distributed database solution built for large-scale applications.

0.0%

1.7K

total stars

C++

#750

influxdata/influxdb-python

A Python client library for interacting with the InfluxDB time-series database.

0.0%

1.7K

total stars

Python
