Explore Projects

Discover 5 open-source projects

Active filters (1):
Search: warc

Showing 1-5 of 5 projects

ArchiveBox/ArchiveBox

Self-hosted web archiving tool for preserving URLs, bookmarks, and media.

27.0K
Active
Python
CLI Tools
Containerization
Docker
#archivebox #backups #web-archiving

internetarchive/heritrix3

Heritrix is an open-source, extensible web crawler for archiving websites at scale.

3.2K
Active
Java
Backend Frameworks
ETL & Pipelines
#web-crawling #warc #java

ArchiveTeam/grab-site

A web crawler that outputs WARC files and provides a dashboard for managing crawls.

1.6K
Experimental
Python
CLI Tools
Backend Frameworks
#archiving #crawling #web-scraping

Rhizome-Conifer/conifer

Web archiving service for collecting and revisiting web pages, built with Python.

1.5K
Active
Python
React
#web-archiving #warc #wayback

webrecorder/archiveweb.page

A high-fidelity web archiving extension for Chrome and Chromium-based browsers, built with TypeScript.

1.4K
Active
TypeScript
Frontend Frameworks
CLI Tools
Chromium
#web-archiving #browser-extension #chromium
