On Tuesday 17 July 2007, Cheryl Chase wrote:
I would like to create a static html archive of my drupal site (on a
regular basis, for corporate compliance reasons related to
documenting previous states of the publicly published information).
I have referred to the drupal manual page at
http://drupal.org/node/27882. But I have two problems (because I am
simply archiving a snapshot of the site, not retiring it, so I don't
want to interfere with the normal, drupal-based, dynamic operation).
1. When I use any of the download tools (wget, sitesucker, I've not
yet tried httrack) to download the site, the function
drupal_get_html_head in file includes/common.inc outputs a directive
". This causes the downloaded pages to contain links which try to
open the original site on the Internet, rather than the local file copy.

I'm afraid the directive didn't come through. Which one do you mean? :-)

2. I would like to be logged in as a special user, named "archive",
which is configured specially for archive purposes. For instance, it
has no permission to search; it displays a custom block that tells
that this is an archived version of the website, and states the date
on which it was archived, etc. This works for me when I manually
log in as user "archive". How can I get a downloading tool to log in as
a drupal user? (They are good at http authentication, but have no
understanding of drupal authentication.)

If you log in as that user in a browser, you can check in the database and
see what the session ID is for that user's session. wget (and probably the
others) can be set to send a specific cookie with each request, and you can
just give it that value. See the man page for the exact syntax, as I don't
recall it at the moment.
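
For what it's worth, here is a rough sketch of the cookie approach, written as a small Python helper that only builds and prints a wget command line (it does not run it). The wget options used (--mirror, --convert-links, --page-requisites, --no-cookies, --header, --directory-prefix) are standard ones, but the helper name, the site URL, the cookie name "PHPSESSID" and the session ID are all placeholders I made up; check the real cookie name and value in your browser or database while logged in as "archive".

```python
import shlex


def build_wget_command(site_url, cookie_name, session_id, dest_dir="archive"):
    """Return a wget argument list that mirrors site_url while sending one
    fixed session cookie with every request.

    cookie_name and session_id are assumptions: copy the real values from
    a browser that is logged in as the archive user.
    """
    return [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--convert-links",     # rewrite links so the local copy is browsable
        "--page-requisites",   # also fetch the images and CSS the pages need
        "--no-cookies",        # disable wget's own cookie handling ...
        # ... so the hand-written Cookie header below is what gets sent
        "--header=Cookie: {}={}".format(cookie_name, session_id),
        "--directory-prefix={}".format(dest_dir),
        site_url,
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    cmd = build_wget_command(
        "http://www.example.com/", "PHPSESSID", "0123456789abcdef"
    )
    # Print a shell-quoted command line that can be reviewed before running.
    print(shlex.join(cmd))
```

Running the script prints a command you can inspect and then paste into a shell. The session presumably only stays valid while the "archive" user remains logged in, so don't log out in the browser until the download has finished.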