Bug#915372: RFP: grab-site — The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns



Package: wnpp
Severity: wishlist


Upstream-Dev: https://github.com/ludios/grab-site/

Programming Language: Python

License: MIT license

Description: grab-site is an easy preconfigured web crawler designed for backing up websites. Give grab-site a URL and it will recursively crawl the site and write WARC files. Internally, grab-site uses a fork of wpull for crawling.