NVIDIA-NeMo/Curator

Scalable data preprocessing and curation toolkit for Large Language Models (LLMs)

Python
AI Coding Tools
Apache-2.0

Stars: 1.4K
Forks: 225
Created: Mar 14, 2024
Last Updated: Mar 5, 2026

Project Analytics

Stars Growth (1 Month): +41 (+3.0% change)
Avg Daily Growth (1 Month): +1.5 stars per day
Fork/Star Ratio (All Time): 15.8% (Good engagement)
Lifetime Growth: 2.0 stars/day over 722 days

Charts: Stars, Forks, Open Issues, Pull Requests, and Commits Over Time

AI-Generated Tags

data-curation
large-language-models
data-preparation
deduplication
