internetarchive/heritrix3

Heritrix is an open-source, extensible web crawler for archiving websites at scale.

Java
Web Development
Backend Frameworks
NOASSERTION

Stars: 3.2K
Forks: 779
Created: Oct 21, 2011
Last Updated: Mar 2, 2026

Project Analytics

Stars Growth (1 Month): +17 (+0.5% change)
Avg Daily Growth (1 Month): +0.6 stars per day
Fork/Star Ratio (All Time): 24.4% (High engagement)
Lifetime Growth: 0.6 stars/day over 5.3K days

[Charts: Stars, Forks, Open Issues, Pull Requests, and Commits Over Time]

AI-Generated Tags

web-crawling
warc
java
open-source
archiving
batch-processing
