huggingface/datatrove

A Python library of customizable pipeline blocks for data processing tasks.

Python
Data & Databases
ETL & Pipelines
Apache-2.0

Stars: 2.9K
Forks: 247
Created: Jun 14, 2023
Last Updated: Mar 4, 2026

Project Analytics

Stars Growth (1 Month): +45 (+1.6% change)
Avg Daily Growth (1 Month): +1.6 stars per day
Fork/Star Ratio (All Time): 8.5% (normal engagement)
Lifetime Growth: 2.9 stars/day over 997 days

[Charts: Stars, Forks, Open Issues, Pull Requests, and Commits Over Time]

AI-Generated Tags

data-processing
pipeline
customizable
cli
etl
