allenai/dolma

A Python library and tools for generating and inspecting data for pre-training large language models (LLMs).

Python
AI & Machine Learning
LLM Frameworks
Apache-2.0

Stars: 1.4K
Forks: 167
Created: Jun 20, 2023
Last Updated: Nov 5, 2025

Project Analytics

Stars Growth (1 Month): +19 (+1.4% change)
Avg Daily Growth (1 Month): +0.7 stars per day
Fork/Star Ratio (All Time): 11.7% (Good engagement)
Lifetime Growth: 1.4 stars/day over 990 days

[Charts: stars, forks, open issues, pull requests, and commits over time]

AI-Generated Tags

large-language-models
data-processing
natural-language-processing
machine-learning
