Simple Page Archive
===================

Written by: Alexander Meisel (spa@meisel.cc)

What is it:
-----------

The "Simple Page Archive" (SPA) is a mirroring and archiving tool for
preserving web pages you are interested in. It is web based and written in
Python. It is known to work with the ZEUS (www.zeus.com) and Apache
(www.apache.org) web servers.

SPA is a simple CGI script which allows you to mirror a single web page. It
stores all images and CSS files locally, so you can browse the archive even
when the original images are no longer available.

How to install:
---------------

The script is dead simple to install! ;-)

1. First you need to download "Beautiful Soup" (BS) from
   http://www.crummy.com/software/BeautifulSoup/
   which is a quite simple but very good HTML parser (not like the one in
   the Python distro .. which is actually broken).
   Please "install" the BS module in the site-packages directory of your
   Python installation.
2. Copy the "index.py" file to the directory of your "web archive".
3. Edit the script and change the wroot variable in the Configuration
   section at the beginning of the script to the document root directory of
   your web archive (NOT the physical path on the disk!)
3.1 If you are behind a firewall and need proxy support, add your proxy
    server in the Configuration section as well.
4. Make sure you have CGI support enabled in your web server.
5. Make sure index.py is called as the default DirectoryIndex.
6. Make sure the permissions of the index.py file and the directory are set
   correctly. The CGI process must be able to write to your archive
   directory.
7. Open a browser and try to mirror a page ;-)

I'm open to ideas and patches .... otherwise just enjoy the software.
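
To illustrate the mirroring idea described under "What is it", here is a
minimal sketch of how the images and stylesheets referenced by a page could
be found before they are stored locally. This is not SPA's actual code: the
names AssetCollector and collect_assets are made up for this example, and it
assumes Python 3 and uses the standard library parser instead of Beautiful
Soup so that it runs without extra modules.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collect absolute URLs of images and stylesheets used by a page.

    Hypothetical helper for illustration only; not part of SPA.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # URL of the page being mirrored
        self.assets = []          # absolute asset URLs, in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            # Resolve relative image paths against the page URL.
            self.assets.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            # Only <link> tags whose rel includes "stylesheet" are CSS files.
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.assets.append(urljoin(self.base_url, attrs["href"]))


def collect_assets(html, base_url):
    """Return the list of asset URLs a mirror would need to download."""
    parser = AssetCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.assets


if __name__ == "__main__":
    sample = (
        '<html><head><link rel="stylesheet" href="/css/main.css"></head>'
        '<body><img src="pics/logo.png"><a href="other.html">x</a></body></html>'
    )
    for url in collect_assets(sample, "http://example.com/articles/index.html"):
        print(url)
```

A real mirror would then download each URL into the archive directory and
rewrite the src and href attributes to point at the local copies; ordinary
links such as other.html are left alone because only a single page is
mirrored.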