Trending Projects

Discover the fastest-growing open source projects

Showing 301-350 of 897 trending projects

#301
RoaringBitmap/RoaringBitmap

A high-performance compressed bitset library for Java used in Apache Spark, Netflix Atlas, and others.

+17
+0.5%
3.8K
total stars
#302
supabase/etl

A real-time Postgres data replication and streaming library built in Rust for building CDC pipelines.

+17
+0.8%
2.2K
total stars
#303
AlexTheAnalyst/PortfolioProjects

A collection of portfolio projects for data analysts.

+17
+1.2%
1.4K
total stars
#304
erthink/libmdbx

High-performance, transactional key-value database engine for embedded systems and cryptocurrencies.

+17
+1.3%
1.4K
total stars
#305
pentaho/pentaho-kettle

Pentaho Data Integration (ETL) is a Java-based tool for building data integration and ETL pipelines.

+16
+0.2%
8.3K
total stars
#306
wireservice/csvkit

A suite of utilities for converting to and working with CSV, the king of tabular file formats.

+16
+0.3%
6.4K
total stars
#307
fluvio-community/fluvio

Fluvio is an event stream processing engine for developers to build responsive data-intensive apps.

+16
+0.3%
5.2K
total stars
#308
tidyverse/dplyr

dplyr is an R package that provides a grammar of data manipulation.

+16
+0.3%
5.0K
total stars
#309
openaddresses/openaddresses

An open-source global repository of address, building, and parcel data for developers and geospatial applications.

+16
+0.5%
3.1K
total stars
#310
awslabs/open-data-registry

A registry of publicly available datasets hosted on AWS for data-driven developers.

+16
+1.0%
1.7K
total stars
#311
hermitdave/FrequencyWords

A word-frequency list generator, with processed frequency files, for text analysis and natural language processing.

+16
+1.1%
1.5K
total stars
#312
orbitinghail/graft

Graft is an open-source transactional storage engine optimized for lazy, partial, and strongly consistent replication, ideal for edge, offline-first, and distributed applications.

+16
+1.1%
1.4K
total stars
#313
openbabel/openbabel

Open Babel is a chemical toolbox for working with chemical data and cheminformatics.

+16
+1.3%
1.3K
total stars
#314
duckdb/dbt-duckdb

A dbt adapter for the DuckDB database, enabling developers to build data pipelines and models with dbt.

+16
+1.3%
1.2K
total stars
#315
ResidentMario/geoplot

A high-level geospatial data visualization library for Python developers working with spatial data.

+16
+1.4%
1.2K
total stars
#316
egbertbouman/youtube-comment-downloader

Simple script for downloading YouTube comments without using the YouTube API.

+16
+1.4%
1.2K
total stars
#317
mybatis/mybatis-3

MyBatis SQL Mapper for Java simplifies database interactions with object mapping.

+15
+0.1%
20.4K
total stars
#318
zhisheng17/flink-learning

This is a comprehensive learning resource for the Flink stream processing framework, covering concepts, principles, and real-world use cases.

+15
+0.1%
15.1K
total stars
#319
litedb-org/LiteDB

LiteDB is a lightweight, embedded NoSQL document database for .NET applications that stores its data in a single file.

+15
+0.2%
9.4K
total stars
#320
PostgresApp/PostgresApp

An open-source macOS application that packages a full PostgreSQL server, providing an easy way to set up and manage a local PostgreSQL database.

+15
+0.2%
7.7K
total stars
#321
wainshine/Chinese-Names-Corpus

A Chinese name corpus and generator for natural language processing and entity recognition.

+15
+0.3%
4.3K
total stars
#322
xerial/sqlite-jdbc

SQLite JDBC Driver: a Java library for accessing SQLite databases.

+15
+0.5%
3.2K
total stars
#323
pydata/pandas-datareader

A Python library for extracting data from a wide range of internet sources into a pandas DataFrame.

+15
+0.5%
3.2K
total stars
#324
Visualize-ML/Book6_First-Course-in-Data-Science

A book on data science, covering topics from basic math to machine learning using Python and Jupyter Notebooks.

+15
+0.6%
2.6K
total stars
#325
uhub/awesome-matlab

A curated list of awesome MATLAB frameworks, libraries, and software for scientific computing and data analysis.

+15
+0.9%
1.7K
total stars
#326
lmmentel/awesome-python-chemistry

A curated list of Python packages for chemistry, including computational chemistry, molecular dynamics, and quantum chemistry.

+15
+1.1%
1.4K
total stars
#327
jtv/libpqxx

The official C++ client API for PostgreSQL, providing a high-level interface for interacting with PostgreSQL databases.

+15
+1.2%
1.3K
total stars
#328
uwdata/mosaic

An extensible framework for linking databases and interactive views, focused on scalability and visualization.

+15
+1.2%
1.3K
total stars
#329
scikit-bio/scikit-bio

A versatile Python library for bioinformatics, providing data structures, algorithms, and educational resources.

+15
+1.3%
1.2K
total stars
#330
samapriya/awesome-gee-community-datasets

A community-driven catalog of geospatial datasets for use with Google Earth Engine.

+15
+1.4%
1.1K
total stars
#331
hannorein/rebound

An open-source N-body simulation library for astrophysics and planetary science.

+15
+1.5%
1.0K
total stars
#332
devrimgunduz/pagila

A PostgreSQL sample database for testing and learning SQL queries.

+15
+1.5%
1.0K
total stars
#333
tonsky/datascript

Immutable database and Datalog query engine for Clojure, ClojureScript and JS

+14
+0.3%
5.7K
total stars
#334
electricitymaps/electricitymaps-contrib

An open-source repository for parsing electricity data and powering a comprehensive electricity data platform.

+14
+0.4%
4.0K
total stars
#335
camelot-dev/camelot

A Python library for extracting tabular data from PDF files, useful for data processing and analysis.

+14
+0.4%
3.6K
total stars
#336
jdorfman/awesome-json-datasets

A curated list of awesome JSON datasets that don't require authentication.

+14
+0.4%
3.6K
total stars
#337
linhandev/dataset

A comprehensive index of medical imaging datasets for researchers and developers working in the medical imaging field.

+14
+0.4%
3.5K
total stars
#338
antonycourtney/tad

A desktop application for viewing and analyzing tabular data, with support for CSV, Parquet, and DuckDB.

+14
+0.4%
3.4K
total stars
#339
igrigorik/gharchive.org

An open-source project that captures the public GitHub timeline and makes it accessible for analysis.

+14
+0.5%
3.0K
total stars
#340
apache/incubator-devlake

An open-source dev data platform to ingest, analyze, and visualize data from DevOps tools for engineering insights.

+14
+0.5%
2.9K
total stars
#341
ekzhu/datasketch

A Python library for data sketching techniques like MinHash, LSH, HyperLogLog, and HNSW for approximate similarity search.

+14
+0.5%
2.9K
total stars
#342
mourner/rbush

RBush is a high-performance JavaScript R-tree-based 2D spatial index for points and rectangles.

+14
+0.5%
2.7K
total stars
#343
RoaringBitmap/CRoaring

Optimized Roaring bitmaps in C and C++ with SIMD (AVX2, AVX-512, NEON) for high-performance data processing.

+14
+0.8%
1.8K
total stars
#344
chaisql/chai

A modern, embedded SQL database written in Go for embedded and mobile applications.

+14
+0.8%
1.7K
total stars
#345
substrait-io/substrait

A cross-platform way to express data transformation, relational algebra, and standardized record expression and plans.

+14
+1.0%
1.5K
total stars
#346
cvxgrp/cvxportfolio

A Python library for portfolio optimization and back-testing in finance.

+14
+1.2%
1.2K
total stars
#347
orbitdb/orbitdb

OrbitDB is a peer-to-peer database for the decentralized web, enabling developers to build offline-first, distributed applications.

+13
+0.1%
8.7K
total stars
#348
has2k1/plotnine

A grammar of graphics library for creating highly customizable and publication-quality plots in Python.

+13
+0.3%
4.5K
total stars
#349
Visualize-ML/Book2_Beauty-of-Data-Visualization

A collection of Jupyter Notebook files focused on data visualization and machine learning concepts.

+13
+0.4%
3.6K
total stars
#350
awslabs/deequ

Deequ is a Scala library for defining "unit tests for data" to measure data quality in large datasets.

+13
+0.4%
3.6K
total stars
