Trending Projects

Discover the fastest growing open source projects

Showing 801-850 of 897 trending projects

#801

rsvp/fecon235

Notebooks for financial economics, including analyses of Federal Reserve, GDP, inflation, and more.

+0.7%

1.3K

total stars

Jupyter Notebook

#802

2ndQuadrant/pglogical

A high-performance logical replication extension for PostgreSQL that enables fast, cross-version database replication.

+0.8%

1.2K

total stars

#803

bigdatagenomics/adam

ADAM is a genomics analysis platform with specialized file formats built using Apache Spark and Apache Parquet.

+0.9%

1.0K

total stars

Scala

#804

johannfaouzi/pyts

A Python package for time series classification, useful for developers working with time-series data.

+0.4%

1.9K

total stars

Python

#805

pyjanitor-devs/pyjanitor

A Python library for cleaning and transforming data, inspired by the R package janitor.

+0.5%

1.5K

total stars

Python

#806

cloudera/hue

Open source SQL query assistant service for databases and data warehouses

+0.6%

1.4K

total stars

JavaScript

#807

sajal2692/data-science-portfolio

A portfolio of data science projects covering machine learning, NLP, and more for personal and academic use.

+0.7%

1.2K

total stars

Jupyter Notebook

#808

facebook/mysql-5.6

This is Facebook's branch of the Oracle MySQL database, including the MyRocks storage engine.

+0.3%

2.6K

total stars

C++

#809

benedekrozemberczki/awesome-community-detection

A curated list of community detection research papers with implementations for data science and network analysis.

+0.3%

2.4K

total stars

Python

#810

neo4j-contrib/neo4j-apoc-procedures

A collection of procedures for the Neo4j graph database, providing advanced graph algorithms and utilities.

+0.4%

1.9K

total stars

Java

#811

tylertreat/BoomFilters

Performant probabilistic data structures for processing continuous, unbounded streams in Go.

+0.4%

1.6K

total stars

#812

oracle-samples/oracle-db-examples

This repository provides code examples for Oracle's AI-enabled database features and integrations.

+0.5%

1.4K

total stars

Java

#813

jrfiedler/causal_inference_python_code

Python code for 'Causal Inference: What If', a book by Miguel Hernán and James Robins.

+0.5%

1.3K

total stars

Jupyter Notebook

#814

petl-developers/petl

A Python library for extracting, transforming, and loading tabular data.

+0.5%

1.3K

total stars

Python

#815

apache/impala

Apache Impala is a high-performance, open-source SQL query engine that runs on Apache Hadoop and Apache Kudu.

+0.6%

1.3K

total stars

C++

#816

andrewgbruce/statistics-for-data-scientists

This repository provides code and data for a book on statistics for data scientists.

+0.6%

1.2K

total stars

#817

oetiker/rrdtool-1.x

RRDtool is a time-series database system for efficiently storing and graphing data.

+0.7%

1.1K

total stars

#818

twosigma/flint

A time series library for Apache Spark that provides a high-level API for working with time series data.

+0.7%

1.0K

total stars

Scala

#819

apache/cassandra

Apache Cassandra is a distributed, wide-column store database system designed for high availability, scalability, and performance.

+0.1%

9.6K

total stars

Java

#820

rosedblabs/rosedb

Lightweight, fast, and reliable key-value database engine in Go for high-throughput applications.

+0.1%

4.9K

total stars

#821

hardikkamboj/An-Introduction-to-Statistical-Learning

This repository provides Python implementations of exercises from the book 'An Introduction to Statistical Learning'.

+0.2%

2.5K

total stars

Jupyter Notebook

#822

cnosdb/cnosdb

A high-performance, highly available, and distributed time series database written in Rust.

+0.3%

1.7K

total stars

Rust

#823

mongodb/mongo-hadoop

A Java connector for integrating MongoDB with Hadoop ecosystems for big data processing.

+0.4%

1.6K

total stars

Java

#824

PKUJohnson/OpenData

An open-source financial data extraction tool that provides simple API access to data scraped from various websites.

+0.4%

1.4K

total stars

Python

#825

wireservice/agate

A Python data analysis library optimized for humans instead of machines.

+0.5%

1.2K

total stars

Python

#826

apache/accumulo

Apache Accumulo is a scalable and robust key-value store that provides a sparse, sorted, distributed, and persistent multi-dimensional table.

+0.5%

1.1K

total stars

Java

#827

sripathikrishnan/redis-rdb-tools

A Python tool to parse Redis dump.rdb files, analyze memory usage, and export data to JSON.

+0.1%

5.2K

total stars

Python

#828

orbitinghail/sqlsync

Collaborative offline-first SQLite wrapper for syncing app state across users & devices

+0.2%

2.9K

total stars

Rust

#829

hi-primus/optimus

Agile data preparation workflows made easy with popular Python data science libraries.

+0.3%

1.5K

total stars

Python

#830

FirebirdSQL/firebird

Firebird is a relational database management system (RDBMS) suitable for a wide range of applications from desktop to client-server to large databases.

+0.4%

1.4K

total stars

C++

#831

PumpkinDB/PumpkinDB

PumpkinDB is an immutable, ordered key-value database engine written in Rust.

+0.4%

1.4K

total stars

Rust

#832

alan-turing-institute/CleverCSV

A Python package for handling messy CSV files with improved dialect detection and a command-line interface.

+0.4%

1.3K

total stars

Python

#833

mahmoudparsian/pyspark-tutorial

PySpark-Tutorial provides basic algorithms using PySpark for big data analytics and data processing.

+0.4%

1.3K

total stars

Jupyter Notebook

#834

syndtr/goleveldb

LevelDB key/value database in Go for building high-performance data-intensive applications.

+0.1%

6.3K

total stars

#835

airbnb/knowledge-repo

A next-generation curated knowledge sharing platform for data scientists and other technical professionals.

+0.1%

5.5K

total stars

Python

#836

jeremyevans/sequel

Sequel is a Ruby library that provides a powerful and flexible object-relational mapping (ORM) for databases.

+0.1%

5.1K

total stars

Ruby

#837

Cyb3rWard0g/HELK

An open-source threat hunting platform built on the ELK stack for security researchers and analysts.

+0.1%

3.9K

total stars

Jupyter Notebook

#838

schematics/schematics

Python data structures library focused on serialization, deserialization, and validation of complex data schemas.

+0.1%

2.6K

total stars

Python

#839

tangwz/db-monthly

A collection of monthly reports on the internals of Alibaba Cloud's database products.

+0.4%

1.1K

total stars

#840

topling/toplingdb

ToplingDB is a cloud-native, distributed, and searchable key-value store built on RocksDB.

+0.4%

1.0K

total stars

C++

#841

WeBankFinTech/DataSphereStudio

DataSphereStudio is a one-stop data application development and management portal covering data exchange, analysis, and visualization.

+0.1%

3.3K

total stars

Java

#842

griddb/griddb

GridDB is a fast and scalable open-source database for time-series IoT and big data applications.

+0.1%

2.5K

total stars

C++

#843

jstat/jstat

A JavaScript statistical library that provides a wide range of statistical functions for data analysis.

+0.2%

1.8K

total stars

JavaScript

#844

abhishek-ch/around-dataengineering

A comprehensive knowledge hub for data engineering, machine learning, and MLOps tools and practices.

+0.3%

1.1K

total stars

Python

#845

youngyangyang04/Skiplist-CPP

A lightweight key-value store built with C++ using a skiplist data structure.

+0.1%

2.4K

total stars

C++

#846

orbitjs/orbit

A composable data framework for building ambitious web applications using TypeScript.

+0.1%

2.3K

total stars

TypeScript

#847

typelevel/skunk

A functional, type-safe, composable Scala data access library for Postgres databases.

+0.1%

1.6K

total stars

Scala

#848

aergoio/litetree

SQLite with Branches - a lightweight, embedded database with version control capabilities.

+0.1%

1.6K

total stars

#849

gobuffalo/pop

A Go ORM and query builder for interacting with databases in Go applications.

+0.1%

1.5K

total stars

#850

GeostatsGuy/PythonNumericalDemos

Python demos for spatial data analytics, geostatistics, and machine learning to support courses.

+0.1%

1.5K

total stars

Jupyter Notebook

