Re: [support] support Digest, Vol 55, Issue 25
On Jul 18, 2007, at 5:00 AM, support-request@drupal.org wrote:
On Tuesday 17 July 2007, Cheryl Chase wrote:
I would like to create a static html archive of my drupal site (on a regular basis, for corporate compliance reasons related to documenting previous states of the publicly published information).
I have referred to the drupal manual page at http://drupal.org/node/27882. But I have two problems (because I am simply archiving a snapshot of the site, not retiring it, so I don't want to interfere with the normal, drupal-based, dynamic operation).
1. When I use any of the download tools (wget, sitesucker, I've not yet tried httrack) to download the site, the function drupal_get_html_head in file includes/common.inc outputs a directive ". This causes the downloaded pages to contain links which try to open the original site on the Internet, rather than the local file copy.
I'm afraid the directive didn't come through. Which one do you mean? :-)
The base directive (sorry, I used html and it was stripped out).
2. I would like to be logged in as a special user, named "archive", which is configured specially for archive purposes. For instance, it has no permission to search; it displays a custom block that says this is an archived version of the website and states the date on which it was archived, etc. This works for me when I manually log in as user "archive". How can I get a downloading tool to log in as a Drupal user? (The tools are good at HTTP authentication, but have no understanding of Drupal authentication.)
If you log in as that user, you can check in the database and see what the session variable is for that user. wget (and probably the others) can be set to send a specific cookie with each request, and you can just give it that value. See the man page for the exact syntax, as I don't recall it at the moment.
But then, wouldn't I have to manually log in first? I'm trying to create an automated procedure.
Cheryl
On Wednesday 18 July 2007, Cheryl Chase wrote:
1. When I use any of the download tools (wget, sitesucker, I've not yet tried httrack) to download the site, the function drupal_get_html_head in file includes/common.inc outputs a directive ". This causes the downloaded pages to contain links which try to open the original site on the Internet, rather than the local file copy.
I'm afraid the directive didn't come through. Which one do you mean? :-)
The base directive (sorry, I used html and it was stripped out).
Ah, the base HTML tag. That means you're running 4.6, which is no longer supported. The base tag was removed in Drupal 4.7 because it caused too many problems. I suspect this is one of them. :-)
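For a site that cannot be upgraded right away, one possible workaround is to post-process the downloaded copies and remove the base tag from each page. The following is only a minimal sketch of that idea, not something from the Drupal handbook: the function name and the regular expression are illustrative, and it assumes the archived pages are ordinary HTML files that are safe to rewrite.

```python
import re

# Matches a <base ...> element, case-insensitively, self-closed or not.
# The \b keeps it from matching other tags that merely start with "base",
# such as <basefont>.
BASE_TAG = re.compile(r"<base\b[^>]*>\s*", re.IGNORECASE)


def strip_base_tag(html: str) -> str:
    """Return the page source with every <base> element removed."""
    return BASE_TAG.sub("", html)
```

To apply it to a whole download, one could walk the archive directory (for example with pathlib's `Path.rglob("*.html")`), read each file, pass its text through `strip_base_tag`, and write the result back.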
2. I would like to be logged in as a special user, named "archive", which is configured specially for archive purposes. For instance, it has no permission to search; it displays a custom block that says this is an archived version of the website and states the date on which it was archived, etc. This works for me when I manually log in as user "archive". How can I get a downloading tool to log in as a Drupal user? (The tools are good at HTTP authentication, but have no understanding of Drupal authentication.)
If you log in as that user, you can check in the database and see what the session variable is for that user. wget (and probably the others) can be set to send a specific cookie with each request, and you can just give it that value. See the man page for the exact syntax, as I don't recall it at the moment.
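As a rough illustration of sending a fixed cookie with every request, the sketch below assembles a wget command line that passes the session cookie as a plain request header. The helper name, the cookie name and value, the URL and the output directory are all placeholders, not values from this thread; the real cookie name and value have to be read from an existing logged-in session.

```python
def build_wget_command(site_url, cookie_name, cookie_value, dest_dir):
    """Build a wget argument list that mirrors site_url with a fixed cookie."""
    return [
        "wget",
        "--mirror",             # recursive download with timestamping
        "--convert-links",      # rewrite links to point at the local copies
        "--page-requisites",    # also fetch images, CSS and similar files
        "--no-cookies",         # turn off wget's own cookie handling ...
        "--header", f"Cookie: {cookie_name}={cookie_value}",  # ... and send ours
        "--directory-prefix", dest_dir,
        site_url,
    ]


# Placeholder values only; substitute the real session cookie and site.
command = build_wget_command(
    "http://www.example.com/", "SESSION_COOKIE_NAME", "SESSION_ID_VALUE", "archive"
)
```

The resulting list can be handed to `subprocess.run(command, check=True)`. Note that the session must still be valid on the server when the download runs.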
But then, wouldn't I have to manually login first? I'm trying to create an automated procedure.
Cheryl
If you're comfortable with shell scripting you could script the login process, get the cookie that's passed back, and then hand that off to wget. Drupal quite deliberately makes it hard to get an authenticated account without human intervention, as 99.9% of the time any script trying to do so is a spammer, cracker, or otherwise someone you don't want. :-)

-- 
Larry Garfield
AIM: LOLG42
larry@garfieldtech.com
ICQ: 6817012

"If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
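The scripted login described above might look roughly like the following, written in Python rather than shell. It is a sketch under several assumptions: the login URL and the form field names (`name` and `pass` here) are guesses that must be checked against the HTML of the site's own login form, since they differ between Drupal versions, and both helper functions are hypothetical. The cookie file is written in the Netscape cookies.txt format, which wget can read with `--load-cookies`.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def encode_login_form(username, password,
                      name_field="name", pass_field="pass",
                      extra_fields=None):
    """URL-encode the login form body.

    The field names are assumptions; copy the real ones, plus any hidden
    fields, from the login form of the site being archived.
    """
    fields = {name_field: username, pass_field: password}
    fields.update(extra_fields or {})
    return urllib.parse.urlencode(fields).encode("ascii")


def login_and_save_cookies(login_url, form_body, cookie_file):
    """POST the login form and save the returned cookies for wget."""
    jar = http.cookiejar.MozillaCookieJar(cookie_file)
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Passing data makes this a POST request; the jar collects Set-Cookie.
    with opener.open(login_url, data=form_body) as response:
        response.read()
    # Session cookies are marked "discard", so they must be kept explicitly.
    jar.save(ignore_discard=True, ignore_expires=True)
    return jar
```

After a call such as `login_and_save_cookies(login_url, encode_login_form("archive", password), "cookies.txt")`, the download step can run as `wget --load-cookies cookies.txt --mirror --convert-links` followed by the site URL. Keeping the "archive" account's password in a file readable only by the archiving job limits the exposure of an unattended login.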