On 9/22/06, Nat Catchpole natcatchpole@hotmail.com wrote:
This may not be Drupal's fault, but I'd appreciate any advice you can offer with it.
Problems like this are almost always hardware-related. Unfortunately, the only way to test this is to take the server out of production, run hardware tests on it (e.g. memtest, CPU-maxing utilities) and then see whether it falls over.
These problems can look like software performance issues because the box only falls over when the system is maxed out and performing badly - but a server should handle being maxed out for days on end without a problem.
If you can add some more monitoring (e.g. for temperature inside the box) and watch for a correlation between the readings and the crashes, you may be able to diagnose the problem without taking the server down. Then the solution is to fix the underlying cause (e.g. a bigger fan, a different cabinet) and move on.
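As a rough sketch of that kind of monitoring, something like the following could log temperature next to load so the two can be compared after a crash. It assumes a Linux box that exposes a sensor at /sys/class/thermal/thermal_zone0/temp (reported in thousandths of a degree Celsius); that path, the function names and the CSV layout are all assumptions for illustration, and your hardware may expose its sensors somewhere else entirely.

```python
import csv
import os
import time
from datetime import datetime, timezone
from pathlib import Path

# Assumption: the sensor lives here; the path varies by hardware and kernel.
TEMP_PATH = Path("/sys/class/thermal/thermal_zone0/temp")


def parse_millidegrees(raw):
    """Convert a sysfs reading such as '48000\\n' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_temperature(path=TEMP_PATH):
    """Return the temperature in Celsius, or None if the sensor is unreadable."""
    try:
        return parse_millidegrees(Path(path).read_text())
    except (OSError, ValueError):
        return None


def sample(path=TEMP_PATH):
    """Take one (UTC timestamp, 1-minute load average, temperature) reading."""
    load1, _, _ = os.getloadavg()
    return (datetime.now(timezone.utc).isoformat(), load1, read_temperature(path))


def log_samples(out_file, count, interval_s=60, path=TEMP_PATH):
    """Append `count` samples to a CSV file.

    Each row is flushed and synced to disk straight away, so the last
    reading taken before the box falls over is likely to survive.
    """
    with open(out_file, "a", newline="") as fh:
        writer = csv.writer(fh)
        for i in range(count):
            writer.writerow(sample(path))
            fh.flush()
            os.fsync(fh.fileno())
            if i < count - 1:
                time.sleep(interval_s)
```

Calling log_samples("temps.csv", count=1440) would cover roughly a day at one reading per minute. After the next crash, the last few rows show whether the temperature was climbing with the load beforehand; an empty temperature column means the sensor could not be read at the assumed path.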
Regards, Greg