[drupal-devel] [bug] apache and drupal both compressing pages
Issue status update for http://drupal.org/node/13224
Post a follow up: http://drupal.org/project/comments/add/13224

 Project:      Drupal
 Version:      cvs
 Component:    base system
 Category:     bug reports
 Priority:     critical
 Assigned to:  killes@www.drop.org
 Reported by:  bertboerland@www.drop.org
 Updated by:   Arto
 Status:       patch (code needs review)

If the caching mechanism is to be patched, that'd be a good time to also tackle the problem that it's currently unusable on PostgreSQL: http://drupal.org/node/26369

Arto

Previous comments:
------------------------------------------------------------------------
Sat, 20 Nov 2004 16:22:55 +0000 : bertboerland@www.drop.org

hi

this weekend I went ahead and upgraded drupal from 4.4 to 4.5 on a machine; I didn't have the time/priority (in Dutch these words are almost the same :-) sooner. The core upgrade went fine, followed as usual by lots of unpacking and configuring for all the modules I want to experiment with.

After some time I tried to access the site, in a just-to-be-sure moment, from a Windows machine with an IE browser without being logged in. I found some strange behavior. Sometimes IE wanted to download the file, sometimes it showed the file (being a binary) and sometimes I got a valid page. I never got this from Firefox on either Windows or Linux while logged in. Later I could reproduce this in a Firefox browser without being logged in.

After some time and thinking, I believe I found the cause: both Apache (my web server) and Drupal were compressing (gzipping) the page, while the browser uncompresses a page only once. So if it was a cached page that Drupal had zipped and Apache zipped it as well, the browser just displayed a gzipped page.

I have a rather old setup with an ancient version of Apache, but does this sound possible? Since I am hosting (some) other content outside the drupal dir, I would like to keep Apache gzipping pages, so is there an option to stop drupal from compressing pages?
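
The double-compression hypothesis in the message above can be illustrated in a few lines. This is only a sketch, not Drupal or Apache code: it uses Python's standard gzip module, and the helper name `count_gzip_layers` is made up for illustration. It compresses a page twice, undoes one layer as a browser would, and shows that what remains is still gzip data rather than HTML.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def count_gzip_layers(payload: bytes, limit: int = 5) -> int:
    """Count how often payload can be gunzipped before it stops looking like gzip.

    Hypothetical helper: 0 means plain data, 1 means normally compressed,
    2 or more means the data was compressed more than once.
    """
    layers = 0
    while payload.startswith(GZIP_MAGIC) and layers < limit:
        try:
            payload = gzip.decompress(payload)
        except (OSError, EOFError, zlib.error):
            break  # looked like gzip but was not a valid stream
        layers += 1
    return layers


html = b"<html><body>cached page</body></html>"
once = gzip.compress(html)    # e.g. what a gzipped page cache would store
twice = gzip.compress(once)   # a second layer added on the way out

# A browser undoes exactly one layer, because only one Content-Encoding is declared.
seen_by_browser = gzip.decompress(twice)
print(count_gzip_layers(twice), seen_by_browser.startswith(GZIP_MAGIC))
```

The same check also covers the opposite failure: a body with zero gzip layers that is nevertheless sent with Content-Encoding: gzip.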
------------------------------------------------------------------------
Sat, 20 Nov 2004 16:32:16 +0000 : killes@www.drop.org

Which method does your Apache use to gzip the pages? Which version is it? I find it strange that we get this report only several weeks after the release. I suspect that your particular setup is broken.
------------------------------------------------------------------------
Sat, 20 Nov 2004 16:45:23 +0000 : bertboerland@www.drop.org

The reason I waited so long was that I wanted others to find this out :-) Mind you, I *think* the cause is that both are compressing the cached pages. You might want to try it out for yourself by going to http://willy.boerland.com/myblog; if there is a cached page, you will end up downloading a binary that, once renamed to .gz and unpacked, gives you the html file.

drupal 4.5
apache: 1.3.26 (with *all* security patches)
php: 4.2.3
Content-Encoding: gzip

And yes, since I guess 100+ sites are already at 4.5, I think it is related to my setup. However, do I understand correctly that there is no way of having drupal stop sending out compressed pages? There is no configuration option?
------------------------------------------------------------------------
Sat, 20 Nov 2004 17:06:05 +0000 : killes@www.drop.org

Yes, I get funny garbage, too. The apache version you run is the one Debian provides. It is still used very often, but still nobody but you seems to experience the problem, so I still think that somehow your setup is broken. Apache should recognize that the page is already gzipped. Do you use mod_gzip or libz?
------------------------------------------------------------------------
Sat, 20 Nov 2004 17:57:14 +0000 : bertboerland@www.drop.org

Since this one is more or less critical for me but not for the RotW, re-setting priority to "normal". I am using zlib btw. Did you try to save the file and unzip it? And am I correct that there is no option to disable cached pages from being zipped?
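
killes' expectation above, that an outer layer should notice an already-gzipped response and leave it alone, can be sketched as follows. This is a hypothetical Python illustration, not how mod_gzip, zlib output compression, or Drupal are actually implemented; the function name `compress_once` and the plain-dict header representation are assumptions.

```python
import gzip
from typing import Dict, Tuple


def compress_once(headers: Dict[str, str], body: bytes) -> Tuple[Dict[str, str], bytes]:
    """Gzip body unless the response already declares a Content-Encoding."""
    # HTTP header names are case-insensitive, so normalise before looking.
    declared = {name.lower(): value for name, value in headers.items()}
    if declared.get("content-encoding", "").strip():
        return headers, body  # already encoded upstream: pass through untouched
    new_headers = dict(headers)
    new_headers["Content-Encoding"] = "gzip"
    return new_headers, gzip.compress(body)


page = b"<html>hello</html>"
h1, b1 = compress_once({"Content-Type": "text/html"}, page)  # first layer compresses
h2, b2 = compress_once(h1, b1)                               # second layer must not
```

If every layer applied a check like this, a page compressed by the application would pass through the server unchanged, which is the behaviour the thread expects but does not always observe.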
------------------------------------------------------------------------
Sat, 20 Nov 2004 18:30:23 +0000 : bertboerland@www.drop.org

I think I "solved" this for myself by changing bootstrap.inc. Setting this to postponed; if no one else experiences this problem within 3 weeks I will close this issue.
------------------------------------------------------------------------
Sat, 20 Nov 2004 18:31:19 +0000 : killes@www.drop.org

I tried to download it now, but you must have changed something; the page renders fine. You are right, there is no option to disable the gzipped cache. I'd really like to discover the reason for this problem. Normally, zlib should check for gzipped content and not gzip it again. Drupal sends the appropriate header.
------------------------------------------------------------------------
Sat, 20 Nov 2004 18:32:25 +0000 : killes@www.drop.org

Locking could have prevented this. ;)
------------------------------------------------------------------------
Sun, 21 Nov 2004 12:36:23 +0000 : axel@drupal.ru
I found some strange behavior. Sometimes IE wanted to download the file, sometimes it showed the file (being a binary) and sometimes I got a valid page.
Sometimes I get the same thing in Mozilla Firefox with different sites (not sure, but not only with Drupal) when I work through my local proxies (I use a chain of two proxies on my Debian box: one for banner removal (privoxy) and a second for offline page caching (wwwoffle)). I didn't dig into the details of this behaviour, but simply switching off the proxies and reloading the page in the browser solves the problem; then I switch the proxies on again and the page loads OK. Maybe you also access the site through a proxy? Then you need to check the proxy's settings, I think.
------------------------------------------------------------------------
Sun, 21 Nov 2004 21:13:26 +0000 : bertboerland@www.drop.org

Axel, no, it was not related to proxies or even the browser. It was related to cached pages being zipped by drupal and sent to apache, which was zipping the page as well, resulting in a doubly zipped page for the client, which unzipped the page only once and hence displayed a binary.
------------------------------------------------------------------------
Tue, 21 Dec 2004 17:10:12 +0000 : bertboerland@www.drop.org

It seems I was not the only one [1]. There is not enough information about the others' setups (/please post here!/) to find a generic cause, but it is certain that some setups will cause zipped pages to be zipped again.

[1] http://drupal.org/node/14569
------------------------------------------------------------------------
Thu, 03 Feb 2005 15:03:17 +0000 : leafish_dylan

Has anybody found a better way to fix this problem? Disabling the storage of compressed cache pages in bootstrap.inc works, but it's ugly.

Is an uncompressed cache likely to take up a lot of space in the database? If not, perhaps a global "gzip/zlib compression" option for Drupal would be simpler.
------------------------------------------------------------------------
Sat, 05 Feb 2005 14:40:16 +0000 : Anonymous

As author of the gzipped cache patch I'd really appreciate it if the people who experience this problem could supply detailed(!)
info on their setup and the headers sent both by the browser and the server. I'd like to fix the problem (if there is one on Drupal's side) or at least document correct server settings if the problem is there.
------------------------------------------------------------------------
Sat, 05 Feb 2005 16:54:42 +0000 : bertboerland@www.drop.org

Here is some info, please ask if you need more:

apache 1.3.26
mmcache (version unknown)
drupal 4.5.1

Note: I didn't post (or think about!) the mmcache between drupal and apache earlier. The configuration of mmcache is:

extension="mmcache.so"
mmcache.shm_size="16"
mmcache.cache_dir="/tmp/mmcache"
mmcache.enable="1"
mmcache.optimizer="1"
mmcache.check_mtime="1"
mmcache.debug="0"
mmcache.filter=""
mmcache.shm_max="0"
mmcache.shm_ttl="0"
mmcache.shm_prune_period="0"
mmcache.shm_only="0"
mmcache.compress="1"

headers:

http://willy.boerland.com/myblog/

GET /myblog/ HTTP/1.1
Host: willy.boerland.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: PHPSESSID=307f07c169e3bbcbe92d150026c97583

HTTP/1.x 200 OK
Date: Sat, 05 Feb 2005 16:45:49 GMT
Server: Apache-AdvancedExtranetServer/1.3.26
X-Powered-By: PHP/4.2.3
Content-Encoding: gzip
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
------------------------------------------------------------------------
Mon, 21 Feb 2005 10:53:02 +0000 : coma

I have exactly the same problem. My setup is currently apache 2.0.52, php 4.3.10 and drupal 4.5.2. The relevant info is:

from apache:

<Location />
# Insert filter
SetOutputFilter DEFLATE
# Netscape 4.x has some problems...
BrowserMatch ^Mozilla/4 gzip-only-text/html
# Netscape 4.06-4.08 have some more problems
BrowserMatch ^Mozilla/4\.0[678] no-gzip
# MSIE masquerades as Netscape, but it is fine
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
# Don't compress images
SetEnvIfNoCase Request_URI \
\.(?:gif|jpe?g|png)$ no-gzip dont-vary
# Make sure proxies don't deliver the wrong content
Header append Vary User-Agent env=!dont-vary
</Location>

In php.ini:

zlib.output_compression = On

And I can assure you that my problem was that last element: changing zlib.output_compression to Off solved the problem, and the output of all php scripts (not only drupal) is still compressed by either drupal or apache.

You can test these things with wget:

wget --header "Accept-Encoding: gzip,deflate" http://localhost/

The file wget drops should be gzipped html. If you run zcat on it and get plain html, all is good; if you get another gzipped file, it has been double compressed.
------------------------------------------------------------------------
Sat, 02 Apr 2005 19:38:42 +0000 : moshe weitzman

i too am seeing strange behavior here. specifically, php is simply dying after printing out the $cache->data in drupal_page_header().

truly, i don't think drupal should be storing gzipped cache, nor do i think it should perform gzip at all. these are better handled at the php or apache layers.
------------------------------------------------------------------------
Sat, 23 Apr 2005 14:06:14 +0000 : mjr

Apparently I am observing a new variant of that problem in one of the subsites of my 4.6 test installation: after enabling caching, lynx, w3m and the Gecko-based browsers display garbage. lynx tells me it wants to load a file named index.html.gz.

Running wget http://beate/drupal46/ on that site gives a readable html page.
w3m -dump_head tells me:

Received cookie: PHPSESSID=a6916d57d08727ab8ed945ddac4ecf2c
HTTP/1.1 200 OK
Date: Sat, 23 Apr 2005 13:56:19 GMT
Server: Apache/2.0.53 (Debian GNU/Linux) PHP/4.3.10-10 mod_ssl/2.0.53 OpenSSL/0.9.7e mod_perl/1.999.20 Perl/v5.8.4
X-Powered-By: PHP/4.3.10-10
Set-Cookie: PHPSESSID=a6916d57d08727ab8ed945ddac4ecf2c; expires=Mon, 16 May 2005 17:29:39 GMT; path=/
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Sat, 23 Apr 2005 13:56:19 GMT
Cache-Control: no-store, no-cache, must-revalidate
Cache-Control: post-check=0, pre-check=0
Pragma: no-cache
Content-Encoding: gzip
Vary: Accept-Encoding
Connection: close
Content-Type: text/html; charset=utf-8

Apparently, the site sends uncompressed pages but tells the browser they are compressed. BTW, links2 and dillo do display the page but will not allow me to log in and disable caching.

Any clues?

Thanks
Michael
------------------------------------------------------------------------
Sat, 23 Apr 2005 14:08:42 +0000 : mjr

Perhaps I should mention that zlib.output_compression = Off in my php.ini
------------------------------------------------------------------------
Sat, 23 Apr 2005 18:43:30 +0000 : bertboerland@www.drop.org

/Apparently i am observing a new variant of that problem /

Please open a new bug report, since your problem is likely unrelated to this one. You might want to include a link to this issue, but don't use this issue for anything other than what is described.
------------------------------------------------------------------------
Sat, 04 Jun 2005 23:53:22 +0000 : mjr

no, it is not unrelated. After some investigation, I found out that the garbage is being stored in drupal's cache. This has been reported in this thread. My present workaround is to completely disable drupal's caching (and to manually delete the compressed pages from the database).

What's the use of letting drupal compress pages and store them in its cache table? Just saving a few cycles of CPU time?
At the cost of letting the web server and its client decide whether to transfer compressed or uncompressed data: they have the protocol to do that, drupal has not.

Michael
------------------------------------------------------------------------
Mon, 06 Jun 2005 16:48:30 +0000 : killes@www.drop.org

We save CPU cycles and storage space. Apache only needs to check the Encoding header, which is easy. I do not know why zlib output compression doesn't check those headers for some (or all?) people.
------------------------------------------------------------------------
Fri, 10 Jun 2005 23:32:40 +0000 : bluec

I have the same problem with Drupal 4.6.1: if the site delivers a cached page, it is gzipped twice and the result is garbage in my browser window. I tried two different browsers, Firefox and Opera, and it's the same problem with both of them.

I recorded the HTTP headers:

- Cache turned on
- zlib compression turned on
- Result: garbage

http://kim.sshtun.v63.org:22003/index.html

GET /index.html HTTP/1.1
Host: kim.sshtun.v63.org:22003
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: de-de,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://kim.sshtun.v63.org:22003/fachschaft/
Cookie: PHPSESSID=6ddbf3d5a32d-CHANGED-BY-CHRIS

HTTP/1.x 200 OK
Date: Fri, 10 Jun 2005 23:16:08 GMT
Server: Apache/2.0.54 (Gentoo/Linux) mod_ssl/2.0.54 OpenSSL/0.9.7e PHP/5.0.4
X-Powered-By: PHP/5.0.4
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Last-Modified: Fri, 10 Jun 2005 23:11:59 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: PHPSESSID=6ddbf3d5a32d20-CHANGED-BY-CHRIS; expires=Mon, 04 Jul 2005 02:49:28 GMT; path=/
Etag: "03cff5fcc105e72dfc2671d3ffdb285c"
Content-Encoding: gzip
Content-Length: 5024
Keep-Alive: timeout=15, max=95
Connection: Keep-Alive
Content-Type: text/html; charset=utf-8

Now I turned off compression by setting the following in my .htaccess:

php_flag zlib.output_compression Off

and the result is OK:

http://kim.sshtun.v63.org:22003/index.html

GET /index.html HTTP/1.1
Host: kim.sshtun.v63.org:22003
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: de-de,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://kim.sshtun.v63.org:22003/index.html
Cookie: PHPSESSID=6ddbf3d5a32d20-CHANGED-BY-CHRIS

HTTP/1.x 200 OK
Date: Fri, 10 Jun 2005 23:17:08 GMT
Server: Apache/2.0.54 (Gentoo/Linux) mod_ssl/2.0.54 OpenSSL/0.9.7e PHP/5.0.4
X-Powered-By: PHP/5.0.4
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Last-Modified: Fri, 10 Jun 2005 23:11:59 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: PHPSESSID=6ddbf3d5a32d208-CHANGED-BY-CHRIS; expires=Mon, 04 Jul 2005 02:50:28 GMT; path=/
Etag: "03cff5fcc105e72dfc2671d3ffdb285c"
Content-Encoding: gzip
Vary: Accept-Encoding
Content-Length: 5054
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=utf-8

Turning off the cache is a solution as well. As soon as both the cache and zlib are operating, the result is garbage.

If you need any other information just let me know. I can turn on compression at any time and test other solutions.

Chris
------------------------------------------------------------------------
Mon, 27 Jun 2005 18:10:12 +0000 : aries

The same here. Interestingly, only one page does this; all the others do not.

--
Aries
http://aries.mindworks.hu
------------------------------------------------------------------------
Thu, 07 Jul 2005 14:46:34 +0000 : fago

quite strange.
i never had problems with this, then i upgraded from 4.6.1 to drupal 4.6.2... now i must turn off the cache, or i have the troubles mentioned here... (i haven't touched the apache or php config in the meantime)
------------------------------------------------------------------------
Thu, 14 Jul 2005 09:47:32 +0000 : bertboerland@www.drop.org

yet another one [2]; closed that one, raised importance here.

[2] http://drupal.org/node/25326
------------------------------------------------------------------------
Sat, 16 Jul 2005 19:52:26 +0000 : bertboerland@www.drop.org

killes suggested that it wasn't apache that compressed the page; drupal and php did. after some testing, i found out he was correct! to solve this:

1) either comment out the gzip lines in bootstrap.inc
2) or change "zlib.output_compression = On" to Off in /etc/php.ini

A (real) fix is needed. Drupal should check whether the page was zipped in the first place....
------------------------------------------------------------------------
Fri, 29 Jul 2005 12:17:46 +0000 : killes@www.drop.org

Attachment: http://drupal.org/files/issues/gzip.patch (2.45 KB)

Here's a patch that needs some testing.
------------------------------------------------------------------------
Fri, 29 Jul 2005 13:18:51 +0000 : Dries

Why do we set $do_cache to TRUE in the nested if's? $do_cache is already set to TRUE higher up, so we're just overwriting TRUE with TRUE?

Also, can we rename $do_cache to $cache? That is more Drupal-ish.

Does that solve all apache-Drupal-caching problems or only some? I vaguely remember we agreed to add a setting, because that was perceived as the only True Solution.
------------------------------------------------------------------------
Fri, 29 Jul 2005 18:22:04 +0000 : killes@www.drop.org

Attachment: http://drupal.org/files/issues/gzip_0.patch (2.46 KB)

I've updated the patch. The patch is intended to cure all the problems we had. It needs testing, though.

I don't recall agreeing that a setting would be needed.
Just for the record: Steven measured the ratio of serving gzipped cached pages to re-unzipped pages on drupal.org to be about 5.1:1. That is, of roughly 6 page requests that we serve from the cache, only one needs the cache entry to be unzipped. This is probably due to crawlers that only speak HTTP 1.0.
------------------------------------------------------------------------
Sat, 30 Jul 2005 14:33:42 +0000 : bertboerland@www.drop.org

people who experienced this problem, please try killes' patch and report here whether it was successful or not. i will patch as well
------------------------------------------------------------------------
Mon, 01 Aug 2005 14:55:47 +0000 : moshe weitzman

In my opinion, this is a case of misplaced optimization. Very little CPU is required to gzip a document. Why else would Apache and PHP provide an option to do this *for every page*? PHP knows that all its pages will be dynamic, yet it still offers this option.

My recommendation is to remove this feature entirely, and take the win in simplicity/code reduction. Furthermore, we won't have any more issues like this one, which has lingered unfixed for 9 months. Admins who want gzip will elect to do so in their php.ini or .htaccess.
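
The trade-off debated in the last few messages (store the cached page gzipped, serve it as-is to clients that accept gzip, and unzip it only for the rest) can be sketched like this. It is an illustrative Python sketch, not the contents of gzip.patch or of bootstrap.inc; `serve_cached_page` and `client_accepts_gzip` are hypothetical names, and the Accept-Encoding parsing is deliberately simplified (q-values are ignored).

```python
import gzip
from typing import Dict, Tuple


def client_accepts_gzip(accept_encoding: str) -> bool:
    """Simplified check: is 'gzip' one of the listed codings? Ignores q-values."""
    codings = [part.split(";")[0].strip().lower() for part in accept_encoding.split(",")]
    return "gzip" in codings


def serve_cached_page(cached_gzip: bytes, accept_encoding: str) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for a page that is stored gzipped in the cache."""
    headers = {"Content-Type": "text/html; charset=utf-8", "Vary": "Accept-Encoding"}
    if client_accepts_gzip(accept_encoding):
        # Common case in the thread's measurement: hand out the stored bytes unchanged.
        headers["Content-Encoding"] = "gzip"
        return headers, cached_gzip
    # Rarer case (e.g. HTTP 1.0 crawlers): unzip the cache entry before sending.
    return headers, gzip.decompress(cached_gzip)


cached = gzip.compress(b"<html>cached</html>")
gz_headers, gz_body = serve_cached_page(cached, "gzip,deflate")
plain_headers, plain_body = serve_cached_page(cached, "")
```

Note that this sketch does nothing to prevent a later layer from compressing the body again; that still depends on the outer layer honouring the Content-Encoding header, which is exactly what failed for the setups reported here.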