Hi folks,
We've worked with jpcache in the past to produce static page caching for dynamic sites. We'd like to see the ability to create and edit pages with Drupal that get saved as good old-fashioned HTML files.
Perhaps this would happen on a daily basis, perhaps only under severe load, perhaps only for paths with a .html extension. Either way, not all Drupal content needs to be loaded from Drupal all of the time. Throttle is nice. The page caching is a good step in the right direction too. But I'd like to cache an entire page so that PHP isn't even loaded at all to deliver the page.
So, I'd just like to know what attempts have been made to do this in the past (if any), and what problems they had (so that hopefully we don't repeat the same ones).
We've investigated this for other CMSs, but not for Drupal.
Mike
--
Mike Gifford, OpenConcept Consulting
Free Software for Social Change -> http://openconcept.ca
http://del.icio.us/mgifford/drupal
http://flickr.com/photos/mgifford/sets/1178930/
Latest CivicSpace Drupal Launch -> Hameed Law -> http://www.hameedlaw.ca/
http://cvs.drupal.org/viewcvs/drupal/contributions/sandbox/jeremy/filecache/... would be great if you reimplemented this in modern drupal and submitted for core review. -moshe
On 9-Jan-06, at 4:22 PM, Moshe Weitzman wrote:
http://cvs.drupal.org/viewcvs/drupal/contributions/sandbox/jeremy/filecache/?hideattic=0 would be great if you reimplemented this in modern drupal and submitted for core review.
Like many enhancements to Drupal, having them would be great. However, I would really like to know if this is seen as a general need in this community or not.
I know it could go a long way toward reducing server loads, but would people use it? More importantly, would there be enough people interested in this to develop a reverse bounty to see that it gets implemented properly?
I'm not sure at this point that I've got the client base to fund all of this development internally. We're building some requirements now, so will have a better sense of it fairly soon.
Pages like this are very useful: http://drupal.org/node/2601
But it still isn't going to give you the snappy response that you'd get from a Drupal page that has been cached as a static HTML page. A program like jpcache (http://www.jpcache.com/) would be a bit slower than a static page because Apache would have to load PHP and then load the cached file (from a file or db). However, it would still use a fraction of the resources that a Drupal site would.
The only reference to jpcache on drupal.org is: http://drupal.org/node/12169
There are a lot of issues to be considered with static page caching, and I'm not sure how (or if) it is possible to accommodate multi-site installs. Having a few folks contribute ideas, concerns, and limitations would be great.
Mike
--
Mike Gifford, OpenConcept Consulting
Free Software for Social Change -> http://openconcept.ca
NGOs & Drupal -> http://del.icio.us/tag/drupal+ngo
Latest NGO Launch -> Foundation for Iranian Studies - http://fis-iran.org/
On Tue, 10 Jan 2006 16:21:45 +0100, Mike Gifford <mike@openconcept.ca> wrote:
Like many enhancements to Drupal, having them would be great. However, I would really like to know if this is seen as a general need in this community or not.
Well, I do not know. Given the price of RAM and the ease of integrating memcached into Drupal... I have code under testing; I'll release it soon.
I think Dries is right: we need a caching API that makes it possible to cache lots of stuff. For example, I have a big, big taxonomy which very rarely changes. It's a waste to process it again and again in taxonomy_get_tree, so I store the structure for that vocabulary in memcached. And so on.
Regards
NK
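The pattern described above (keep an expensive, rarely changing structure in a cache and rebuild it only on a miss) can be sketched roughly as follows. This is an illustration in Python, not Drupal code: the dict stands in for memcached, and `cached`, `build_tree`, and the key name are hypothetical names chosen for the example.

```python
import time

# In-process dict standing in for memcached (an assumption for this sketch).
_store = {}

def cached(key, builder, ttl=3600, now=time.time):
    """Return the cached value for key, calling builder() only on a miss or after expiry."""
    entry = _store.get(key)
    if entry is not None and entry[1] > now():
        return entry[0]
    value = builder()
    _store[key] = (value, now() + ttl)
    return value

calls = []

def build_tree():
    """Hypothetical stand-in for an expensive call such as taxonomy_get_tree."""
    calls.append(1)
    return ["term-a", "term-b"]

first = cached("taxonomy_tree:1", build_tree)
second = cached("taxonomy_tree:1", build_tree)  # served from the cache, builder not called
```

A general caching API along these lines would let any subsystem wrap its own expensive lookups without caring which backend holds the data.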
At 4:38 PM +0100 10/1/06, Karoly Negyesi wrote:
Well, I do not know. Given the price of RAM and the ease of integrating memcached into Drupal... I have code under testing, I'll release it soon.
I did some benchmarking [1] of storing the menu tree in APC and concluded that the way we're doing it is just fine. The only thing you're going to save with APC or memcached is the database query and I think it would be an unusual use case where the complexity of APC was worthwhile to save one query per page.
...Richard.
[1] http://lists.drupal.org/archives/development/2005-11/msg00896.html
On Jan 10, 2006, at 7:21 AM, Mike Gifford wrote:
But it still isn't going to give you the snappy response that you'd get from a drupal page that has been cached as a static html page.
Mike, I am interested in seeing how Drupal could deliver static pages through a cache mechanism like http://www.squid-cache.org/. This seems to be very popular as part of a LAMP stack.
Cheers,
Kieran
On 10-Jan-06, at 8:38 AM, Kieran Lal wrote:
Mike I am interested in seeing how Drupal could deliver static pages through a cache mechanism like http://www.squid-cache.org/.
This seems to be very popular as part of a LAMP stack.
AKA reverse proxy. This is well-understood technology with regard to dynamic apps. Most of the configuration is done at the Squid level, which understands the appropriate headers sent by Apache and uses them to determine when/if content has changed. I believe this is a different layer and can be discussed separately.
--
Boris Mann
Vancouver 778-896-2747 San Francisco 415-367-3595
SKYPE borismann
http://www.bryght.com
On Jan 10, 2006, at 11:57 AM, Boris Mann wrote:
Mike I am interested in seeing how Drupal could deliver static pages through a cache mechanism like http://www.squid-cache.org/.
AKA reverse proxy. This is well understood technology with regards to dynamic apps. Most of the configuration is done at the Squid level, which understands the appropriate Apache headers looking at when/if content has changes.
It's well worth roping squid into the discussion. Well-funded sites with heavy load may have the resources to implement this, and it's very effective for load-management overall: Squid's cached pages are in memory/disk, and apache/php is never invoked when an object is intercepted by the reverse proxy. You take the load off of the apache server so it can focus on the rest of the dynamic tasks. This lets you tune apache better, turning KeepAlive off, etc.
As configured, Drupal can't get much benefit from squid. It will cache the css and image files, but all of the php content includes headers that explicitly disable HTTP caching of any kind. This is because the same resource is accessed by both anonymous and registered users, so a single url can change radically. You might try and get tricky with public/private headers, and get Drupal to NOT start a session until the user is authenticated. But that's tricky.
The second problem with proxy caching is that there's no clean way to expire content. It will live for as long as its Expires header indicates. So a newly-published article will take a little while to "kick in" on the cache server. Contrast that with the caching strategies discussed here, where the static content can be regenerated on the fly.
I'd love to see Drupal become more cacheable, but the anonymous/authenticated thing is quite a challenge. But the solutions discussed here can help: as long as the cached files are delivered without non-caching headers, they can be intercepted by squid for doubleplusgood caching performance.
Allie Micka
pajunas interactive, inc.
http://www.pajunas.com
scalable web hosting and open source strategies
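The public/private header idea mentioned above could look roughly like the sketch below. It is written in Python for illustration only and is not how Drupal sets its headers. The function name `cache_headers` and the 300-second default are assumptions. `Cache-Control` and `Vary` are standard HTTP response headers, and the sketch assumes anonymous visitors carry no session cookie.

```python
def cache_headers(is_authenticated, max_age=300):
    """Pick response headers so a shared proxy may store only anonymous pages."""
    if is_authenticated:
        # Personalised output: shared caches must not store or reuse it.
        return {
            "Cache-Control": "private, no-cache, must-revalidate",
            "Vary": "Cookie",
        }
    # Anonymous output: a reverse proxy may reuse it for max_age seconds.
    return {
        "Cache-Control": "public, max-age=%d" % max_age,
        "Vary": "Cookie",
    }

anonymous = cache_headers(False)
logged_in = cache_headers(True)
```

The `max_age` value is also the expiry trade-off raised above: a longer lifetime means fewer hits on Apache, but a newly published article takes that much longer to appear through the proxy.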
On 12-Jan-06, at 6:22 AM, Allie Micka wrote:
It's well worth roping squid into the discussion. Well-funded sites with heavy load may have the resources to implement this, and it's very effective for load-management overall: Squid's cached pages are in memory/disk, and apache/php is never invoked when an object is intercepted by the reverse proxy. You take the load off of the apache server so it can focus on the rest of the dynamic tasks. This lets you tune apache better, turning KeepAlive off, etc.
I didn't mean to suggest that Squid isn't a good option for "doubleplusgood caching", just that it happens at a different layer and isn't part of a Drupal discussion, although it *is* part of a scalability discussion. As you say, make it easier for Drupal to be cached, and Squid will do good things for you. -- Boris Mann Vancouver 778-896-2747 San Francisco 415-367-3595 SKYPE borismann http://www.bryght.com
Mike Gifford wrote:
Like many enhancements to Drupal, having them would be great. However, I would really like to know if this is seen as a general need in this community or not.
When Jeremy produced his patch, it was felt that this would not be the case. However, nowadays there are many more high profile sites and I think a patch like this would help them.
I know it could go a long ways to reducing server loads, but would people use it? More importantly, would there be enough people interested in this to develop a reverse bounty to see that it gets implemented properly?
I'm not sure at this point that I've got the client base to fund all of this development internally. We're building some requirements now, so will have a better sense of it fairly soon.
Pages like this are very useful: http://drupal.org/node/2601
But it still isn't going to give you the snappy response that you'd get from a drupal page that has been cached as a static html page. A program like jpcache (http://www.jpcache.com/), would be a bit slower than a static page because apache would have to load php, and then load the cached file (from a file or db), however it would still use a fraction of the resources that a drupal site would.
The only reference to jpcache in drupal.org is: http://drupal.org/node/12169
There are a lot of issues to be considered with static page caching and I'm not sure how (or if) it is possible to accommodate multi-site installs. Having a few folks contributing ideas, concerns, limitations would be great.
I think that multi-site installs would probably be possible. I also think that you could maybe use htaccess to direct the user to a static file if it exists and to Drupal otherwise. I.e., you'd have a directory /node with the cache files in it. The main problem I see is access permissions. If you have a purely public site, those would not be a problem.
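The decision such a rewrite rule would make (serve the static file if one exists, otherwise fall through to Drupal) can be sketched as below. This is Python used purely to illustrate the logic, not an htaccess rule. The `resolve` function, the `.html` naming scheme, and the `has_session` flag are assumptions made for the example.

```python
import os
import tempfile

def resolve(cache_root, request_path, has_session):
    """Return the cached file to serve, or None to hand the request to Drupal."""
    if has_session:
        # Logged-in users always get a dynamic page (sidesteps access permissions).
        return None
    cache_root = os.path.normpath(cache_root)
    relative = request_path.strip("/") or "index"
    candidate = os.path.normpath(os.path.join(cache_root, relative + ".html"))
    # Refuse paths that escape the cache directory (for example via "..").
    if not candidate.startswith(os.path.join(cache_root, "")):
        return None
    return candidate if os.path.isfile(candidate) else None

# Build a throwaway cache directory containing one cached node page.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "node"))
with open(os.path.join(root, "node", "12.html"), "w") as fh:
    fh.write("<html>cached node 12</html>")

hit = resolve(root, "/node/12", has_session=False)
miss = resolve(root, "/node/13", has_session=False)
```

Keeping one cache root per site is one way the multi-site case might be handled, though that is a guess rather than something settled in this thread.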
Latest NGO Launch -> Foundation for Iranian Studies - http://fis-iran.org/
Say, can you get us a Drupal translation to Farsi? Cheers, Gerhard
When Jeremy produced his patch, it was felt that this would not be the case. However, nowadays there are many more high profile sites and I think a patch like this would help them.
I tend to agree. It looks like a fairly simple job to implement. -- Dries Buytaert :: http://buytaert.net/
Please note that JPCache, which Mike pointed out, is GPL, so it is safe to integrate with it.
Specifically, check these benchmark results: http://www.jpcache.com/main.php?content=globalis
Also, see the how-to for PHPNuke. We could do something similar (cache only for anonymous users, who are by far the bulk of visitors [including crawlers]). http://www.jpcache.com/main.php?content=howto
Certainly worth considering ...
On Tuesday 10 January 2006 16:48, Dries Buytaert wrote:
When Jeremy produced his patch, it was felt that this would not be the case. However, nowadays there are many more high profile sites and I think a patch like this would help them.
I tend to agree. It looks like a fairly simple job to implement.
I once did a dump (using wget and a spidering tool) of all pages in a Drupal site into a static site. I cannot say anything about performance, but I can certainly say that we have a hell of a lot of pages. The site had approx. 50 nodes, but about 700 html pages :).
Bèr
--
Bèr Kessels | webschuur.com | website development
Jabber & Google Talk: ber@jabber.webschuur.com
http://bler.webschuur.com | http://www.webschuur.com
But it still isn't going to give you the snappy response that you'd get from a drupal page that has been cached as a static html page. A program like jpcache (http://www.jpcache.com/), would be a bit slower than a static page because apache would have to load php, and then load the cached file (from a file or db), however it would still use a fraction of the resources that a drupal site would.
I fail to see the advantage of using jpcache over writing static files. Care to elaborate? -- Dries Buytaert :: http://www.buytaert.net/
JPCache allows you to do different things:
- Caching to flat files, or to MySQL
- Caching only for anonymous users, still dynamic for logged in users.
On 10 Jan 2006, at 18:47, Khalid B wrote:
JPCache allows you to do different things:
- Caching to flat files, or to MySQL
- Caching only for anonymous users, still dynamic for logged in users.
Conclusion: we don't need JPCache and are better off with our own solution? -- Dries Buytaert :: http://www.buytaert.net/
No, I meant it is configurable, and depends on how you set it up. Check the two links I sent before for details. Of course, Squid is another viable option as well.
Khalid B wrote:
Of course, Squid is another viable option as well.
Not viable for everyone, though. It seems like it isn't very likely that "normal" users would have squid proxies set up. Yet writing static files to a /nodes folder seems really accessible in most cases.
I'd be very interested in static page caching. Whether we use a library or develop our own should only be a question of what is easiest to implement/maintain.
Robert
On Tue, 2006-01-10 at 19:19 +0100, Robert Douglass wrote:
Khalid B wrote:
Of course, Squid is another viable option as well.
Not viable for everyone, though. It seems like it isn't very likely that "normal" users would have squid proxies set up. Yet writing static files to a /nodes folder seems really accessible to most cases.
I'd be very interested in static page caching. Whether we use a library or develop our own should only be a question of what is easiest to implement/maintain.
Robert
Blocks and menus can present a challenge... What about a lightweight blocks.php and menus.php, using Ajax or SSIs to load them into cached pages? Microcontent servers.
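The SSI variant of this idea can be sketched as follows: the cached page carries include directives, and the dynamic fragments are spliced in at delivery time. In practice the web server's SSI module would do the expansion. The Python below only mimics that step, and `expand_includes`, `fetch_fragment`, and the fragment contents are hypothetical.

```python
import re

# Matches an SSI-style directive such as <!--#include virtual="/blocks.php" -->
INCLUDE = re.compile(r'<!--#include virtual="([^"]+)" -->')

def expand_includes(cached_html, fetch_fragment):
    """Replace each include directive with the fragment fetched for its URL."""
    return INCLUDE.sub(lambda match: fetch_fragment(match.group(1)), cached_html)

def fetch_fragment(url):
    """Stand-in for a request to a lightweight script such as blocks.php."""
    fragments = {
        "/blocks.php": "<div>Online users: 3</div>",
        "/menus.php": "<ul><li>Home</li></ul>",
    }
    return fragments[url]

page = (
    '<body><!--#include virtual="/menus.php" -->'
    "<p>Cached article</p>"
    '<!--#include virtual="/blocks.php" --></body>'
)
rendered = expand_includes(page, fetch_fragment)
```

The article body stays static while only the small fragment scripts run per request, which is the trade-off being proposed.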
on 01/10/2006 12:34 PM Darrel O'Pry said the following: <snip>
Blocks and menus can present a challenge... what about a lightweight blocks.php and menus.php. use ajax or SSI's to load them into cached pages? microcontent servers.
True. A page cache won't work for all sites. But where it does fit, it really makes sense to support it nicely. This doesn't address sub-page element caching which should also be done as much as possible where it would be effective. Joe
Darrel O'Pry wrote:
Blocks and menus can present a challenge... what about a lightweight blocks.php and menus.php. use ajax or SSI's to load them into cached pages? microcontent servers.
Well, in my books cache never means 'cache forever.' Therefore blocks and menus are fine - if there is some latency it's not the end of the world for most people... But - a few must-have features are:
- static page cache 'time to live' setting
- manual 'rebuild' static page cache button of some sort
- theme change detection to 'urge' admins to update their static page cache
- setting for hook_cron - i.e. rebuild none/some/all static page cache items every so often
andre
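Three of the features listed above (a time to live, a manual rebuild, and a cron pass) fit together roughly as in the sketch below. It is a toy in-memory model in Python, not a Drupal module. `StaticPageCache` and its method names are hypothetical, and theme change detection is left out.

```python
class StaticPageCache:
    """Toy page cache with a time to live, a manual rebuild, and a cron pass."""

    def __init__(self, render, ttl, clock):
        self.render = render  # callable: path -> HTML
        self.ttl = ttl        # 'time to live' in seconds
        self.clock = clock    # callable returning the current time
        self.pages = {}       # path -> (html, built_at)

    def get(self, path):
        """Serve the cached page, rebuilding it first if missing or stale."""
        entry = self.pages.get(path)
        if entry is None or self.clock() - entry[1] >= self.ttl:
            return self.rebuild(path)
        return entry[0]

    def rebuild(self, path):
        """Manual 'rebuild' action: re-render one page unconditionally."""
        html = self.render(path)
        self.pages[path] = (html, self.clock())
        return html

    def cron(self):
        """Periodic pass: rebuild every stale page and report which ones."""
        stale = [p for p, (_, built) in self.pages.items()
                 if self.clock() - built >= self.ttl]
        for path in stale:
            self.rebuild(path)
        return stale

now = [0]  # fake clock so the example is deterministic
cache = StaticPageCache(
    lambda path: "<html>%s built at %d</html>" % (path, now[0]),
    ttl=60,
    clock=lambda: now[0],
)
first = cache.get("/node/1")   # built at t=0
now[0] = 30
second = cache.get("/node/1")  # still fresh, served from the cache
now[0] = 61
rebuilt = cache.cron()         # stale by now, so the cron pass rebuilds it
third = cache.get("/node/1")
```

A "rebuild none/some/all" setting could be layered on by filtering which paths `cron` considers.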
on 01/10/2006 01:44 PM andre said the following:
Darrel O'Pry wrote:
Blocks and menus can present a challenge... what about a lightweight blocks.php and menus.php. use ajax or SSI's to load them into cached pages? microcontent servers.
Well in my books cache never means 'cache forever.' Therefore blocks and menus are fine - if there is some latency its not the end of the world for most people... But - a few must have features are:
- static page cache 'time to live' setting
- manual 'rebuild' static page cache button of some sort
- theme change detection to 'urge' admins to update their static page cache
- setting for hook_cron - i.e. rebuild none/some/all static page cache items every so often
Yes, there would be a need for a cron job to help with garbage collection in the event that no PHP pages were called. Otherwise, whenever cacheable PHP pages are called, the garbage collection at least has a possibility of being run.
Building a static site completely is one method CMSs use to deploy sites for production. Another is to build the cache as the files are requested. This is the method proposed.
For completeness, hooks for jpcache to do the things above and mentioned earlier are here: http://sourceforge.net/tracker/index.php?func=detail&aid=643773&group_id=239...
The method linked at the jpcache site for integrating jpcache lacks any control other than cache expire time.
But I want to be clear. I am not tied to jpcache. The internal cache mechanism could be expanded. We are interested in gauging the interest, whether existing methods were preferred, etc. Also whether the anon cache would be better moved into a more robust cache module.
Joe
Dries Buytaert wrote:
On 10 Jan 2006, at 18:47, Khalid B wrote:
JPCache allows you to do different things:
- Caching to flat files, or to MySQL
- Caching only for anonymous users, still dynamic for logged in users.
Conclusion: we don't need JPCache and are better off with our own solution?
indeed. we already do all of the above except for flat files which aren't exactly rocket science. here is a low tech but perfectly reasonable version of flat files in drupal: http://drupal.org/node/29970. requires mod_rewrite ... now that drupal can bootstrap only the DB, I think this script can be rewritten as a pure Drupal script and the config file removed. -moshe
on 01/10/2006 11:25 AM Dries Buytaert said the following:
But it still isn't going to give you the snappy response that you'd get from a drupal page that has been cached as a static html page. A program like jpcache (http://www.jpcache.com/), would be a bit slower than a static page because apache would have to load php, and then load the cached file (from a file or db), however it would still use a fraction of the resources that a drupal site would.
I fail to see the advantage of using jpcache over writing static files. Care to elaborate?
For the static pages it would not. For logged-in users, you can cache the pages in another area, memcached, or db. Also, some sites on public web hosts might not want to write back to the public_html space.
We have patches to jpcache to control garbage collection and expiring pages. Since Drupal knows what changes are made and what pages they would affect, those cached pages can be expired/deleted. This would give a much longer cache time and a higher hit rate. We have also extended jpcache to write a static cache for another project.
We were thinking more about a flexible cache mechanism that different backends could use. jpcache is flexible for full page caching but doesn't handle subpage caching. I've tested a memcached backend for jpcache as well.
There are a couple of different use cases, like anon and identified users. We aren't tied to jpcache but have used it for very much this purpose very successfully.
The drawback for a page cache would be for something that might change with each pageview (like if you have a simple banner system that subs in regular ad links and images each view).
Joe
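The expiry idea above (the CMS knows which content changed, so it can delete exactly the cached pages that showed it) can be sketched like this. It is an in-memory Python illustration, not the jpcache patches mentioned. `ExpiringPageIndex` and its method names are hypothetical.

```python
from collections import defaultdict

class ExpiringPageIndex:
    """Toy page cache that remembers which cached paths display which node."""

    def __init__(self):
        self.pages = {}                        # path -> cached HTML
        self.paths_by_node = defaultdict(set)  # node id -> paths showing that node

    def store(self, path, html, node_ids):
        """Cache a page and record every node it depends on."""
        self.pages[path] = html
        for nid in node_ids:
            self.paths_by_node[nid].add(path)

    def node_changed(self, nid):
        """Expire every cached page that displayed the changed node."""
        expired = sorted(self.paths_by_node.pop(nid, set()))
        for path in expired:
            # Other nodes may still list this path; popping with a default
            # keeps a later expiry of the same path harmless.
            self.pages.pop(path, None)
        return expired

index = ExpiringPageIndex()
index.store("/node/7", "<html>node 7</html>", [7])
index.store("/node/9", "<html>node 9</html>", [9])
index.store("/", "<html>front page teasers</html>", [7, 9])
expired = index.node_changed(7)
remaining = sorted(index.pages)
```

Because pages are removed only when something they show has changed, the time-based lifetime can be set much longer, which is where the better hit rate comes from.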
On Mon, 09 Jan 2006 16:22:33 -0500 Moshe Weitzman <weitzman@tejasa.com> wrote:
http://cvs.drupal.org/viewcvs/drupal/contributions/sandbox/jeremy/filecache/...
would be great if you reimplemented this in modern drupal and submitted for core review.
Okay, I have created a patch for the current CVS version of Drupal: http://drupal.org/node/45414
From the issue summary: "This patch introduces an option for file-based caching. When enabled, a 'cache' subdirectory is created in the files/ directory, and cache data is written there. The best performance improvement will be seen when both the file cache is enabled and a minimum cache lifetime is set."
Discussion, testing, suggestions, improvements, benchmarking, etc, all appreciated. Cheers, -Jeremy
participants (17)
- Allie Micka
- andre
- Boris Mann
- Bèr Kessels
- Darrel O'Pry
- Dries Buytaert
- Dries Buytaert
- Gerhard Killesreiter
- Jeremy Andrews
- Joe Stewart
- Karoly Negyesi
- Khalid B
- Kieran Lal
- Mike Gifford
- Moshe Weitzman
- Richard Archer
- Robert Douglass