No suitable nodes available at RackspaceCloud (Mosso)
Hi,
At RackspaceCloud (formerly Mosso) I've been plagued by a very unfortunate problem that is crippling both my work and the work of my clients -- namely the infamous error message "Unfortunately there were no suitable nodes available to serve this request." Those of you at RS Cloud must have bumped into it. It is cryptic and happens unpredictably. The cloud is very stable and scalable, but with any slightly heavier Drupal installation people do start getting these errors.
*Basically, it is a generic error thrown by load balanced systems that occurs as a result of a script exceeding a maximum timeout value (not the PHP timeout value!) If a client connection does not receive a response from the server after approximately 30 to 60 seconds the load balancer will close the connection and the client will immediately receive the error message. In most cases, the script will continue to execute until it reaches completion, throws an error, or times out on the server, but the client will not see the page load as expected and will instead receive this error.*
I've used Boost for anonymous pages, Parallel, Memcache, etc., all of which helped, and anonymous users *usually* don't get this error. The problem is with admin work or any other slightly heavier work by logged-in users. Even for basic Drupal websites with not too many modules! Pages like the list of modules or the status page, i.e. pages with heavy database or file requests or API calls in PHP, are very likely to time out.
Over the past year I've had a number of discussions with techs and admins at that cloud, but the situation is unresolved. They recognize the problem but maintain this is due to the special/unusual setup they use for their cloud. It is not a problem for some other CMSs / frameworks. E.g. a very heavy MediaWiki installation runs just fine. Drupal seems to be less compatible with their system, somehow, somewhere.
*Now, why do I mention all this on the development list? I've been intrigued by one little ray of hope in their words: "if a client connection does not receive a response from the server after approximately 30 to 60 seconds the load balancer will close the connection and the client will immediately receive the error message". Their techs said that if I were able to emit any kind of intermediary response to the client during rendering of the page, then this would be solved.* Indeed, a bit like how the Batch API works in Drupal (with that I often run night-long scripts without problems). I wonder, maybe this is a more generic problem for any system that employs load balancers?
*So my question to you, colleagues, is: do you see any place in the Drupal processing chain that could be used, and approximately how, to make sure that the load balancer keeps the connection open?* If you have any ideas, wild or proven, I will be happy to test and develop them further and bring them back to the community, of course. If this succeeds, I think many of us will be relieved (and able to focus on development again!)
Thank you for any ideas - on and off this list.
Best regards,
Tomáš / Vacilando
Just a really quick one, you have probably tried it: if the PHP timeout is put at 29 seconds, it will preempt the other error, but you can handle it with any usual PHP error handling mechanism for live sites.
Just a quick idea while you get to the bottom of things,
Victor Kane
http://awebfactory.com.ar
In reply to Tomáš Fülöpp (vacilando.org) <tomi@vacilando.org>, Thu, Feb 18, 2010 at 6:42 AM.
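Victor's suggestion could be sketched roughly as follows. This is a minimal illustration, not tested on Rackspace Cloud Sites: the helper names `is_php_timeout()` and `install_timeout_guard()` are hypothetical, and only the 29-second figure comes from his message. It assumes PHP 5.3 or later (for the closure) and that the host permits `set_time_limit()`.

```php
<?php
/**
 * Hypothetical helper: does this error (as returned by error_get_last())
 * look like PHP's own execution-time limit being hit?
 */
function is_php_timeout($error) {
  return is_array($error)
    && $error['type'] === E_ERROR
    && strpos($error['message'], 'Maximum execution time') === 0;
}

/**
 * Hypothetical helper: cap execution below the load balancer's limit and
 * show a friendlier message when the cap is reached.
 */
function install_timeout_guard($seconds = 29) {
  set_time_limit($seconds);
  register_shutdown_function(function () {
    if (is_php_timeout(error_get_last())) {
      // The status line can only be changed if nothing has been sent yet.
      if (!headers_sent()) {
        header('HTTP/1.1 503 Service Unavailable');
      }
      echo 'This page took too long to build. Please try again shortly.';
    }
  });
}
```

One caveat worth keeping in mind: on non-Windows systems `set_time_limit()` counts only the script's own execution time, so time spent waiting on database queries or other external calls is not included. A page that is slow because of the database may therefore still run past 29 seconds of wall-clock time.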
Tomáš Fülöpp (vacilando.org) wrote:
At RackspaceCloud (formerly Mosso) I've been plagued by a very unfortunate problem that is crippling both my work and the work of my clients -- namely the infamous error message "Unfortunately there were no suitable nodes available to serve this request." Those of you at RS Cloud must have bumped into it. It is cryptic and happens unpredictably.
A client of mine has moved MySQL from the cloud to an RS virtual server to get rid of this problem.
The cloud is very stable and scalable, but with any slightly heavier Drupal installation people do start getting these errors.
I suspect that Drupal may send too many SQL queries for RS' liking.
Cheers,
Gerhard
(Interesting, Brian; I was also promised shell access pretty soon, about a year ago. It's a shame - MediaTemple has shell *and* also a breakdown of compute cycles per script...)
Anyway -- Victor's note about shortening the PHP timeout got me thinking about measuring the time since the start of execution and issuing flush() each time the connection might otherwise time out.
Two questions:
1. What is the most suitable Drupal function for this? It needs to be something that runs regularly and for all kinds of pages.
2. For Drupal, is it enough to issue flush(), or is ob_end_flush() also needed, or something else?
Thanks a million for any ideas;
Tomáš / Vacilando
On Thu, Feb 18, 2010 at 15:46, Brian Vuyk <brian@brianvuyk.com> wrote:
I've run into this with a few of my client sites, but they haven't even been high-traffic sites.
Personally, I just don't think the RS Cloud is a good match for Drupal. Combine that with the recent security issues they've had, occasional inexplicable downtime, the 'no suitable nodes' errors and the lack of a shell, and I am moving my sites away as quickly as I can.
The shell issue is really sensitive for me - about 14 months ago, my previous host ran into... issues... and could no longer offer hosting. So, I was in a pinch and Rackspace (then Mosso) looked very good apart from the lack of a shell. I talked to their customer service reps, and was informed that shell access for the cloud was in pre-release testing, and was scheduled to go live the next week.
In a burst of poor judgement, I decided that the package they offered was good enough to do without shell access for a week, so I bought in, and transferred my sites. 14 months later, shell access still hasn't been released, and I've had to move all my more critical / development-intensive sites off of their service in the meantime.
Brian
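Tomáš's idea of measuring elapsed time and flushing before the load balancer gives up could look something like the sketch below. It is only an illustration under stated assumptions: `keepalive_due()` and `keepalive_tick()` are hypothetical names, the 20-second interval is an arbitrary value chosen to stay under the quoted 30-second limit, and where to call it from inside Drupal is deliberately left open, since that is exactly the question being asked.

```php
<?php
/**
 * Hypothetical helper: TRUE when at least $interval seconds have passed
 * since the last keep-alive was sent (or since the first tick).
 */
function keepalive_due($last_sent, $now, $interval) {
  return ($now - $last_sent) >= $interval;
}

/**
 * Hypothetical helper: call from code that runs repeatedly during a long
 * request. Sends one padding byte at most once per $interval seconds.
 * Returns TRUE if a keep-alive was sent on this call.
 */
function keepalive_tick($interval = 20.0) {
  static $last_sent = null;
  $now = microtime(true);
  if ($last_sent === null) {
    // First call: start the clock, send nothing.
    $last_sent = $now;
  }
  if (!keepalive_due($last_sent, $now, $interval)) {
    return false;
  }
  echo ' ';
  // Close any PHP output buffers first, sending their contents onward;
  // flush() on its own only pushes PHP's system-level write buffer.
  // The loop is bounded in case a buffer cannot be removed.
  $levels = ob_get_level();
  for ($i = 0; $i < $levels; $i++) {
    if (!@ob_end_flush()) {
      break;
    }
  }
  flush();
  $last_sent = $now;
  return true;
}
```

On the second question: when output buffering is active, `flush()` alone is generally not enough, because buffered output never reaches the write buffer until it is released with `ob_flush()` or `ob_end_flush()`. Two side effects should be expected. Once any byte has been sent, headers (redirects, cookies) can no longer be changed, and ending the buffers also pushes out whatever partial page output they held. Whether the bytes actually reach the load balancer also depends on anything between PHP and the network, such as compression or the web server's own buffering, which this sketch cannot control.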
Anyone who buys hosting without shell access is gonna get what they pay for...
However, I do want to draw attention to the fact that it is the CloudSITES service that is being discussed here. I have been using the Rackspace CloudSERVERS (similar to Amazon EC2, in concept) offering for a few months now, and I love it. It has not been plagued by the recent security issues, or any of the complaints I hear about CloudSites/Mosso.
My only problem has been a corrupted backup on one occasion, so as always, never trust someone else's backup of your data. That applies to any hosting service, so I highly recommend Rackspace CloudServers.
All the Best,
Matt Chapman
Ninjitsu Web Development
In reply to Tomáš Fülöpp (vacilando.org) <tomi@vacilando.org>, Thu, Feb 18, 2010 at 8:38 AM.
There was an excellent writeup about this set of issues here: http://mavergames.net/content/drupal-rackspace-clouds-cloud-sites-platform-s...
I should mention that it's Rackspace Cloud Sites that is the problem - Rackspace Cloud Servers are quite good, aside from the fact that their network seems to get more than its share of DDoS attacks. There should be *no* password-controlled shell access to one of the servers though, as they're constantly under attack (as all wide-open VPSs on the internet are).
I certainly agree that hosting without shell access is a non-starter in any environment, and for that reason never even experimented with Cloud Sites.
-Randy
In reply to Matt Chapman <matt@ninjitsuweb.com>, Thu, Feb 18, 2010 at 10:08 AM.
-- Randy Fay Drupal Development, troubleshooting, and debugging randy@randyfay.com +1 970.462.7450
Drupal by design doesn't generate output of any kind until the last second, and then sends the entire page as one giant string. That is what allows us to do all sorts of fun things in the theme layer or HTTP redirection before content gets sent.
That said, if I understood the original message, Rackspace is saying the proxy server is timing out after 30 *seconds* of no response? Even the heaviest Drupal page shouldn't get anywhere near that time. 3-4 seconds for something other than selected admin pages is considered an eternity, at least for the PHP time. There's something else going on here besides Drupal not being the fastest PHP app out there...
--Larry Garfield
In reply to Tomáš Fülöpp (vacilando.org).
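Larry's point about building the whole page before sending anything can be illustrated in plain PHP. The sketch below is not Drupal's actual code: `build_response()`, `send_response()` and the `/user/login` path are hypothetical, chosen only to show why a late redirect remains possible as long as no output has gone out.

```php
<?php
/**
 * Hypothetical helper: build the complete response in memory.
 * Nothing is sent to the client while this runs.
 */
function build_response($needs_redirect) {
  if ($needs_redirect) {
    // Still allowed, because no body bytes have been sent yet.
    return array('headers' => array('Location: /user/login'), 'body' => '');
  }
  ob_start();
  echo '<html><body>';
  echo '<p>Rendered late, sent once.</p>';
  echo '</body></html>';
  // Collect the buffered markup as a single string and close the buffer.
  return array('headers' => array(), 'body' => ob_get_clean());
}

/**
 * Hypothetical helper: send headers first, then the body in one piece.
 */
function send_response($response) {
  foreach ($response['headers'] as $header) {
    header($header);
  }
  echo $response['body'];
}
```

The trade-off is the one this thread keeps circling: because the body leaves in a single piece at the very end, the client (and any proxy in between) sees no bytes at all while the page is being built.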
I dunno....I've run Drupal on some really slow servers, and the modules page can take a LONG time to render. I guess it kind of depends on the hardware they're using for Mosso, but I do agree with the sentiment that it's probably not a timeout issue. @Tomáš: If I were in your shoes and an issue like this was unresolved after a year, I think I'd be strongly considering a new hosting provider. Slicehost is fantastic. EC2 is a pretty good choice too (you can just spin up the Mercury AMI and have a really sweet Drupal hosting setup). I hear good things about Linode too. ----- Cameron Eagans Owner, Black Storms Studios, LLC http://www.blackstormsstudios.com On Thu, Feb 18, 2010 at 12:23 PM, larry@garfieldtech.com <larry@garfieldtech.com> wrote:
Drupal by design doesn't generate output of any kind until the last second, and then sends the entire page as one giant string. That is what allows us to do all sorts of fun things in the theme layer or HTTP redirection before content gets sent.
That said, if I understood the original message Rackspace is saying the proxy server is timing out after 30 *seconds* of no response? Even the heaviest Drupal page shouldn't get anywhere near that time. 3-4 seconds for something other than selected admin pages is considered an eternity, at least for the PHP time. There's something else going on here besides Drupal not being the fastest PHP app out there...
--Larry Garfield
Tomáš Fülöpp (vacilando.org) wrote:
(Interesting, Brian; I was also promised shell access pretty soon about a year ago. It's a shame - MediaTemple has shell /and/ also a breakdown of compute cycles per script...)
Anyway -- Victor's note about shortening the PHP timeout got me thinking about measuring the time since the start of execution and issuing flush() each time the process might time out.
Two questions:
1. What is the most suitable Drupal function for this? It needs to be something that runs regularly and for all kinds of pages.
2. For Drupal, is it enough to issue flush(), or is ob_end_flush() also needed, or something else?
Thanks a million for any ideas;
Tomáš / Vacilando
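For anyone who wants to experiment with the timing idea above, here is a minimal sketch in plain PHP. It is not Drupal code: the names keepalive_due() and keepalive_tick() and the 20-second interval are made up for illustration, and where such a call could be hooked into Drupal is left open. On the second question, flush() only pushes out what PHP has already written past its output buffers, so any buffers opened with ob_start() have to be flushed first. The sketch closes them with ob_end_flush(), which may upset code that expects its own buffer to still be there. Note also that the web server may do its own buffering that flush() cannot override, and that sending even one byte commits the HTTP headers.

```php
<?php
// Hypothetical helpers for illustration; none of these names exist in Drupal.

// Assumed interval, chosen to stay under the 30-second lower bound quoted above.
define('KEEPALIVE_INTERVAL', 20.0);

/**
 * Decides whether enough silent time has passed to warrant a keep-alive byte.
 */
function keepalive_due($last_sent, $now, $interval = KEEPALIVE_INTERVAL) {
  return ($now - $last_sent) >= $interval;
}

/**
 * Emits one space and pushes it out if a keep-alive is due.
 *
 * Returns the timestamp of the latest emission, to be passed back in on the
 * next call.
 */
function keepalive_tick($last_sent, $now = NULL) {
  if ($now === NULL) {
    $now = microtime(TRUE);
  }
  if (!keepalive_due($last_sent, $now)) {
    return $last_sent;
  }
  echo ' ';
  // flush() only empties PHP's own write buffer. Anything held by ob_start()
  // buffers never reaches it, so those are closed and flushed first.
  for ($level = ob_get_level(); $level > 0; $level--) {
    if (!@ob_end_flush()) {
      // Stop at a buffer that cannot be removed.
      break;
    }
  }
  flush();
  return $now;
}
```

Usage would be along the lines of $last = microtime(TRUE); at the start of the request, then $last = keepalive_tick($last); at whatever points in the slow code path get visited regularly.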
On Thu, Feb 18, 2010 at 15:46, Brian Vuyk <brian@brianvuyk.com> wrote:
I've run into this with a few of my client sites, but they haven't even been high-traffic sites.
Personally, I just don't think the RS Cloud is a good match for Drupal. Combine that with the recent security issues they've had, occasional inexplicable downtime, the 'no suitable nodes' errors and the lack of a shell, and I am moving my sites away as quickly as I can.
The shell issue is really sensitive for me - about 14 months ago, my previous host ran into... issues... and could no longer offer hosting. So, I was in a pinch and Rackspace (then Mosso) looked very good apart from the lack of a shell. I talked to their customer service reps, and was informed that shell access for the cloud was in pre-release testing, and was scheduled to go live the next week.
In a burst of poor judgement, I decided that the package they offered was good enough to do without shell access for a week, so I bought in, and transferred my sites. 14 months later, shell access still hasn't been released, and I've had to move all my more critical / development-intensive sites off of their service in the meantime.
Brian
Modules page...? ...
Oh, is that the thing we used before drush...? ;-)
All the Best,
Matt Chapman Ninjitsu Web Development
-- The contents of this message should be assumed to be Confidential, and may not be disclosed without permission of the sender.
Yeah...it's now deprecated, but new users who don't know any better like to use it for some reason =P
----- Cameron Eagans Owner, Black Storms Studios, LLC http://www.blackstormsstudios.com
*Some* progress - but it still may be a dead end:
mikeytown2 suggested that I try echoing a comment in index.php. I used flush(); there instead. I cleared all caches and went to a heavy page (modules) and ... it loaded for over a minute, then ended up *without* the "no suitable nodes" message!
The catch -- the resulting page was blank. I looked in the logs and, sure enough, they were full of "PHP Warning: Cannot modify header information - headers already sent in ...".
So the last question is - is there a way to reset the headers after flush();... somehow tell the browser, while it is still waiting, that it should forget about the initial flush, and that the real headers and the page come now?
Alternatively, is there a possibility to output the headers earlier in the process, as soon as possible, and then put this flush(); after that...? That would be the solution!!
Inclined to hope again... Let me know
Tomáš / Vacilando
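On the two alternatives in the message above: an HTTP response carries its status line and headers exactly once, ahead of the body, so once flush() has pushed out any byte there is nothing left to reset. That leaves the second route, sending the headers first and flushing afterwards. Below is a rough sketch of that ordering in plain PHP. The names try_header() and start_slow_response() are invented for illustration and are not Drupal functions. The price is that anything Drupal would normally decide late (a redirect, a cookie, a different status code) can no longer be sent for that request, which is the flip side of the "no output until the last second" design Larry describes.

```php
<?php
// Hypothetical helpers for illustration; not a Drupal or Rackspace API.

/**
 * Sends a header only while the response headers are still open.
 *
 * $already_sent exists so the decision can be exercised without a web server;
 * when omitted, PHP's own headers_sent() is consulted.
 *
 * Returns TRUE if header() was called, FALSE if it was too late.
 */
function try_header($line, $already_sent = NULL) {
  if ($already_sent === NULL) {
    $already_sent = headers_sent();
  }
  if ($already_sent) {
    // Calling header() at this point is what produces "Cannot modify header
    // information - headers already sent".
    return FALSE;
  }
  header($line);
  return TRUE;
}

/**
 * Assumed ordering for a slow page: headers first, then the first flushed byte.
 */
function start_slow_response() {
  $ok = try_header('Content-Type: text/html; charset=utf-8');
  // A lone newline, assumed to be harmless ahead of HTML markup.
  echo "\n";
  flush();
  return $ok;
}
```

After start_slow_response() has run, any later try_header('Location: ...') should return FALSE instead of logging the warning, which at least makes the lost redirect visible to the calling code.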
participants (8): Brian Vuyk, Cameron Eagans, Gerhard Killesreiter, larry@garfieldtech.com, Matt Chapman, Randy Fay, Tomáš Fülöpp (vacilando.org), Victor Kane