Trending Projects

Discover the fastest-growing open source projects

Showing 701-750 of 897 trending projects

#701
mahmoudparsian/pyspark-tutorial

PySpark-Tutorial provides basic algorithms using PySpark for big data analytics and data processing.

+1
+0.1%
1.3K
total stars
#702
apache/impala

Apache Impala is a high-performance, open-source, SQL query engine that runs on Apache Hadoop and Apache Kudu.

+1
+0.1%
1.3K
total stars
#703
percona/percona-server

Percona Server is an enhanced, open-source version of the MySQL database management system.

+1
+0.1%
1.3K
total stars
#704
wannesm/dtaidistance

A fast C-based implementation of Dynamic Time Warping, a popular algorithm for comparing time series data.

+1
+0.1%
1.2K
total stars
#705
juliasilge/tidytext

A library for text mining and natural language processing using tidy data principles in R.

+1
+0.1%
1.2K
total stars
#706
sryza/spark-timeseries

A library for time series analysis on Apache Spark, enabling efficient large-scale time series processing.

+1
+0.1%
1.2K
total stars
#707
8080labs/ppscore

A Python library that provides a Predictive Power Score (PPS) to measure the predictive power between variables.

+1
+0.1%
1.2K
total stars
#708
RxSwiftCommunity/RxRealm

A Swift extension for RealmSwift that provides reactive programming support using RxSwift.

+1
+0.1%
1.2K
total stars
#709
golang/leveldb

The LevelDB key-value database in the Go programming language.

+1
+0.1%
1.2K
total stars
#710
MarcosMeli/FileHelpers

A free and easy-to-use .NET library for reading and writing CSV and fixed-length data files.

+1
+0.1%
1.2K
total stars
#711
GeospatialPython/pyshp

A pure Python library for reading and writing ESRI Shapefiles, a popular geospatial data format.

+1
+0.1%
1.1K
total stars
#712
samayo/country-json

A simple JSON data set of country information, useful for building apps that need country data.

+1
+0.1%
1.1K
total stars
#713
brettkromkamp/contextualise

Contextualise is a powerful tool for organizing diverse information resources in knowledge-intensive projects.

+1
+0.1%
1.1K
total stars
#714
big-data-europe/docker-hive

This is a Docker container for running Apache Hive, a data warehousing tool for big data analysis.

+1
+0.1%
1.1K
total stars
#715
rhiever/datacleaner

A Python tool that automatically cleans and preprocesses data for analysis and machine learning.

+1
+0.1%
1.1K
total stars
#716
marcboeker/go-duckdb

A Go database/sql driver for the DuckDB database engine, enabling fast and efficient data processing.

+1
+0.1%
1.1K
total stars
#717
jorgecarleitao/arrow2

A transmute-free Rust library for working with the Arrow data format.

+1
+0.1%
1.1K
total stars
#718
RedisTimeSeries/RedisTimeSeries

A Redis module that provides a time series data structure for storing and querying time series data.

+1
+0.1%
1.1K
total stars
#719
paulyoder/LinqToExcel

A library that allows developers to use LINQ to retrieve data from spreadsheets and CSV files.

+1
+0.1%
1.1K
total stars
#720
KeithGalli/pandas

Tutorial materials for getting started with pandas, the Python library for data manipulation and analysis.

+1
+0.1%
1.1K
total stars
#721
databricks/spark-csv

CSV Data Source for Apache Spark 1.x, a Scala library for working with structured data.

+1
+0.1%
1.1K
total stars
#722
apache/phoenix

Apache Phoenix is a scalable, distributed SQL engine that connects to HBase for low-latency queries.

+1
+0.1%
1.1K
total stars
#723
J535D165/recordlinkage

A powerful Python library for record linkage and duplicate detection in data-driven applications.

+1
+0.1%
1.0K
total stars
#724
cuge1995/awesome-time-series

A curated list of resources for time series forecasting, including papers, code, and other materials.

+1
+0.1%
1.0K
total stars
#725
avehtari/BDA_py_demos

Provides Bayesian data analysis demos in Python for developers interested in probabilistic modeling.

+1
+0.1%
1.0K
total stars
#726
axiomhq/hyperloglog

HyperLogLog data structure library with space-efficient sparse and LogLog-Beta implementations.

+1
+0.1%
1.0K
total stars
#727
topling/toplingdb

ToplingDB is a cloud-native, distributed, and searchable key-value store built on RocksDB.

+1
+0.1%
1.0K
total stars
#728
blaze/odo

A Python library for data migration and transformation in the Blaze project.

+1
+0.1%
1.0K
total stars
#729
apache/zeppelin

Zeppelin is a web-based notebook that enables data-driven, interactive data analytics and collaborative documents.

0
0.0%
6.6K
total stars
#730
BrambleXu/pydata-notebook

A collection of Jupyter Notebook files for data analysis using Python, including a Chinese translation of the popular 'Python for Data Analysis' book.

0
0.0%
4.7K
total stars
#731
isar/isar

Extremely fast, easy to use, and fully async NoSQL database for Flutter apps

0
0.0%
4.0K
total stars
#732
spark-notebook/spark-notebook

An interactive and reactive data science platform powered by Scala and Apache Spark.

0
0.0%
3.2K
total stars
#733
man-group/arctic

A high-performance datastore for time series and tick data built on top of MongoDB.

0
0.0%
3.1K
total stars
#734
quantopian/qgrid

An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks

0
0.0%
3.1K
total stars
#735
EntilZha/PyFunctional

A Python library for creating data processing pipelines using functional programming principles.

0
0.0%
2.5K
total stars
#736
PizzaDeDados/datascience-pizza

A repository for collecting study materials and resources related to data analysis and related fields.

0
0.0%
2.4K
total stars
#737
orbitjs/orbit

A composable data framework for building ambitious web applications using TypeScript.

0
0.0%
2.3K
total stars
#738
BlankerL/DXY-COVID-19-Data

A data warehouse for COVID-19 time series data, useful for data analysis and visualization.

0
0.0%
2.2K
total stars
#739
chris1610/pbpython

A collection of Python code, notebooks, and examples for practical business data analysis and visualization.

0
0.0%
2.0K
total stars
#740
enhancedformysql/The-Art-of-Problem-Solving-in-Software-Engineering_How-to-Make-MySQL-Better

This repository provides a comprehensive guide on optimizing MySQL performance and solving common database problems.

0
0.0%
1.9K
total stars
#741
yougov/mongo-connector

MongoDB data stream pipeline tools for managing real-time data synchronization and replication.

0
0.0%
1.9K
total stars
#742
neil3d/excel2json

A C# library that converts Excel spreadsheets to JSON objects and saves them to a text file.

0
0.0%
1.9K
total stars
#743
dask/dask-tutorial

An interactive tutorial for the Dask distributed computing library, focused on data analysis and manipulation.

0
0.0%
1.9K
total stars
#744
citusdata/cstore_fdw

A columnar storage extension for Postgres built as a foreign data wrapper.

0
0.0%
1.8K
total stars
#745
tidyverse/tidyverse

A collection of R packages for data science, including tools for data manipulation, visualization, and modeling.

0
0.0%
1.8K
total stars
#746
variety/variety

A MongoDB schema analysis tool that helps developers understand and optimize their NoSQL database.

0
0.0%
1.8K
total stars
#747
crossfilter/crossfilter

Fast n-dimensional filtering and grouping of records, a powerful data manipulation library for JavaScript.

0
0.0%
1.8K
total stars
#748
cmu-db/noisepage

Self-Driving Database Management System from Carnegie Mellon University

0
0.0%
1.8K
total stars
#749
Tencent/paxosstore

PaxosStore is a high-performance, distributed database solution built for large-scale applications.

0
0.0%
1.7K
total stars
#750
influxdata/influxdb-python

A Python client library for interacting with the InfluxDB time-series database.

0
0.0%
1.7K
total stars