Simple Page Archive 1.3 review
Simple Page Archive is a mirror and archiving tool to copy Web pages you are interested in. The CGI script downloads all images and CSS files to preserve the mirrored Web page.
It works with the ZEUS (www.zeus.com) and Apache (www.apache.org) web servers. SPA is a simple CGI script that lets you mirror a single web page. It stores all images and CSS files locally, so you can browse through the archive without the original images being available.
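To illustrate the mirroring idea described above, here is a minimal sketch of the first step: finding the images and stylesheets a page refers to, so they can be saved locally. This is not SPA's actual code. It uses the standard library's html.parser instead of Beautiful Soup so that it runs without extra installs, and the names AssetCollector and find_assets are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collects absolute URLs of images and stylesheets referenced by a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # URL of the page being mirrored
        self.assets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            # Resolve relative image paths against the page URL.
            self.assets.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            # Only <link rel="stylesheet"> entries are CSS files.
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.assets.append(urljoin(self.base_url, attrs["href"]))


def find_assets(html, base_url):
    """Return the asset URLs found in `html`, in document order."""
    collector = AssetCollector(base_url)
    collector.feed(html)
    return collector.assets
```

A mirroring tool would then download each returned URL into the archive directory and rewrite the page's links to point at the local copies.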
The script is dead simple to install!
1. First you need to download "Beautiful Soup" (BS) from http://www.crummy.com/software/BeautifulSoup/ which is a quite simple but very good HTML parser (unlike the one in the Python distro, which is actually broken). Please "install" the BS module in the site-packages directory of your Python installation.
2. Copy the "index.py" file to the directory of your "web archive".
3. Edit the script and change the wroot variable in the Configuration section at the beginning of the script to the document root directory of your web archive (NOT the physical path on the disk!).
3.1 If you are behind a firewall and you need proxy support, add your proxy server in the Configuration section as well.
4. Make sure you have CGI support enabled in your web server.
5. Make sure index.py is being called as the default DirectoryIndex.
6. Make sure the permissions of the index.py file and the directory are set correctly. The CGI process must be able to write to your archive directory.
7. Open a browser and try to mirror a page ;-)
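If the first mirror attempt fails, steps 3.1 and 6 are the usual suspects. The sketch below shows one way to check both from Python: building a URL opener that goes through a proxy, and testing whether a directory is writable. It is an illustration only, not part of SPA; the function names, the proxy address and the archive path are all assumptions.

```python
import os
import urllib.request


def make_opener(proxy=None):
    """Return a URL opener, routed through `proxy` when one is given.

    `proxy` is a hypothetical setting such as "http://proxy.example:3128".
    """
    if proxy:
        handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
        return urllib.request.build_opener(handler)
    return urllib.request.build_opener()


def archive_is_writable(archive_dir):
    """True if `archive_dir` exists and the current user may create files in it."""
    return os.path.isdir(archive_dir) and os.access(archive_dir, os.W_OK | os.X_OK)
```

Note that the check only reflects the user running it. To test what the CGI process sees, run it as the web server's user, for example by calling archive_is_writable("/path/to/your/archive") from a small CGI script.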
What's New in This Release:
Added filter support
Output now sorted by date