Trending Projects

Discover the fastest growing open source projects

Showing 801-850 of 897 trending projects

#801
rsvp/fecon235

Notebooks for financial economics, including analyses of Federal Reserve, GDP, inflation, and more.

+9
+0.7%
1.3K
total stars
#802
2ndQuadrant/pglogical

A high-performance logical replication extension for PostgreSQL that enables fast, cross-version database replication.

+9
+0.8%
1.2K
total stars
#803
bigdatagenomics/adam

ADAM is a genomics analysis platform with specialized file formats built using Apache Spark and Apache Parquet.

+9
+0.9%
1.0K
total stars
#804
johannfaouzi/pyts

A Python package for time series classification, useful for developers working with time-series data.

+8
+0.4%
1.9K
total stars
#805
pyjanitor-devs/pyjanitor

A Python library for cleaning and transforming data, inspired by the R package janitor.

+8
+0.5%
1.5K
total stars
#806
cloudera/hue

Open source SQL query assistant service for databases and data warehouses

+8
+0.6%
1.4K
total stars
#807
sajal2692/data-science-portfolio

A portfolio of data science projects covering machine learning, NLP, and more for personal and academic use.

+8
+0.7%
1.2K
total stars
#808
facebook/mysql-5.6

This is Facebook's branch of the Oracle MySQL database, including the MyRocks storage engine.

+7
+0.3%
2.6K
total stars
#809
benedekrozemberczki/awesome-community-detection

A curated list of community detection research papers with implementations for data science and network analysis.

+7
+0.3%
2.4K
total stars
#810
neo4j-contrib/neo4j-apoc-procedures

A collection of procedures for the Neo4j graph database, providing advanced graph algorithms and utilities.

+7
+0.4%
1.9K
total stars
#811
tylertreat/BoomFilters

Performant probabilistic data structures for processing continuous, unbounded streams in Go.

+7
+0.4%
1.6K
total stars
#812
oracle-samples/oracle-db-examples

This repository provides code examples for Oracle's AI-enabled database features and integrations.

+7
+0.5%
1.4K
total stars
#813
jrfiedler/causal_inference_python_code

Python code for the book Causal Inference by Miguel Hernán and James Robins.

+7
+0.5%
1.3K
total stars
#814
petl-developers/petl

A Python library for extracting, transforming, and loading tabular data.

+7
+0.5%
1.3K
total stars
#815
apache/impala

Apache Impala is a high-performance, open-source, SQL query engine that runs on Apache Hadoop and Apache Kudu.

+7
+0.6%
1.3K
total stars
#816
andrewgbruce/statistics-for-data-scientists

This repository provides code and data for a book on statistics for data scientists.

+7
+0.6%
1.2K
total stars
#817
oetiker/rrdtool-1.x

RRDtool is a time-series database system for efficiently storing and graphing data.

+7
+0.7%
1.1K
total stars
#818
twosigma/flint

A time series library for Apache Spark that provides a high-level API for working with time series data.

+7
+0.7%
1.0K
total stars
#819
apache/cassandra

Apache Cassandra is a distributed, wide-column store database system designed for high availability, scalability, and performance.

+6
+0.1%
9.6K
total stars
#820
rosedblabs/rosedb

Lightweight, fast, and reliable key-value database engine in Go for high-throughput applications.

+6
+0.1%
4.9K
total stars
#821
hardikkamboj/An-Introduction-to-Statistical-Learning

This repository provides Python implementations of exercises from the book 'An Introduction to Statistical Learning'.

+6
+0.2%
2.5K
total stars
#822
cnosdb/cnosdb

A high-performance, highly available, and distributed time series database written in Rust.

+6
+0.3%
1.7K
total stars
#823
mongodb/mongo-hadoop

A Java connector for integrating MongoDB with Hadoop ecosystems for big data processing.

+6
+0.4%
1.6K
total stars
#824
PKUJohnson/OpenData

An open-source financial data extraction tool that provides easy API access to data scraped from various websites.

+6
+0.4%
1.4K
total stars
#825
wireservice/agate

A Python data analysis library optimized for humans instead of machines.

+6
+0.5%
1.2K
total stars
#826
apache/accumulo

Apache Accumulo is a scalable and robust key-value store that provides a sparse, sorted, distributed, and persistent multi-dimensional table.

+6
+0.5%
1.1K
total stars
#827
sripathikrishnan/redis-rdb-tools

A Python tool to parse Redis dump.rdb files, analyze memory usage, and export data to JSON.

+5
+0.1%
5.2K
total stars
#828
orbitinghail/sqlsync

Collaborative offline-first SQLite wrapper for syncing app state across users & devices

+5
+0.2%
2.9K
total stars
#829
hi-primus/optimus

Agile data preparation workflows made easy with popular Python data science libraries.

+5
+0.3%
1.5K
total stars
#830
FirebirdSQL/firebird

Firebird is a relational database management system (RDBMS) suitable for a wide range of applications from desktop to client-server to large databases.

+5
+0.4%
1.4K
total stars
#831
PumpkinDB/PumpkinDB

PumpkinDB is an immutable, ordered key-value database engine written in Rust.

+5
+0.4%
1.4K
total stars
#832
alan-turing-institute/CleverCSV

A Python package for handling messy CSV files with improved dialect detection and a command-line interface.

+5
+0.4%
1.3K
total stars
#833
mahmoudparsian/pyspark-tutorial

PySpark-Tutorial provides basic algorithms using PySpark for big data analytics and data processing.

+5
+0.4%
1.3K
total stars
#834
syndtr/goleveldb

LevelDB key/value database in Go for building high-performance data-intensive applications.

+4
+0.1%
6.3K
total stars
#835
airbnb/knowledge-repo

A next-generation curated knowledge sharing platform for data scientists and other technical professionals.

+4
+0.1%
5.5K
total stars
#836
jeremyevans/sequel

Sequel is a Ruby library that provides a powerful and flexible object-relational mapping (ORM) for databases.

+4
+0.1%
5.1K
total stars
#837
Cyb3rWard0g/HELK

An open-source threat hunting platform built on the ELK stack for security researchers and analysts.

+4
+0.1%
3.9K
total stars
#838
schematics/schematics

Python data structures library focused on serialization, deserialization, and validation of complex data schemas.

+4
+0.1%
2.6K
total stars
#839
tangwz/db-monthly

A collection of monthly reports on the internals of Alibaba Cloud's database products.

+4
+0.4%
1.1K
total stars
#840
topling/toplingdb

ToplingDB is a cloud-native, distributed, and searchable key-value store built on RocksDB.

+4
+0.4%
1.0K
total stars
#841
WeBankFinTech/DataSphereStudio

DataSphereStudio is a one-stop data application development and management portal covering data exchange, analysis, and visualization.

+3
+0.1%
3.3K
total stars
#842
griddb/griddb

GridDB is a fast and scalable open-source database for time-series IoT and big data applications.

+3
+0.1%
2.5K
total stars
#843
jstat/jstat

A JavaScript statistical library that provides a wide range of statistical functions for data analysis.

+3
+0.2%
1.8K
total stars
#844
abhishek-ch/around-dataengineering

A comprehensive knowledge hub for data engineering, machine learning, and MLOps tools and practices.

+3
+0.3%
1.1K
total stars
#845
youngyangyang04/Skiplist-CPP

A lightweight key-value store built with C++ using a skiplist data structure.

+2
+0.1%
2.4K
total stars
#846
orbitjs/orbit

A composable data framework for building ambitious web applications using TypeScript.

+2
+0.1%
2.3K
total stars
#847
typelevel/skunk

A functional, type-safe, composable Scala data access library for Postgres databases.

+2
+0.1%
1.6K
total stars
#848
aergoio/litetree

SQLite with Branches - a lightweight, embedded database with version control capabilities.

+2
+0.1%
1.6K
total stars
#849
gobuffalo/pop

A Go ORM and query builder for interacting with databases in Go applications.

+2
+0.1%
1.5K
total stars
#850
GeostatsGuy/PythonNumericalDemos

Python demos for spatial data analytics, geostatistics, and machine learning to support courses.

+2
+0.1%
1.5K
total stars