[development] Re: [drupal-devel] performance improvements - avoiding big unserialize()

Richard Archer drupal.org at juggernaut.com.au
Wed Nov 30 23:24:22 UTC 2005


At 12:21 PM -0500 30/11/05, Moshe Weitzman wrote:

>i don't understand this. you get no benefit by just storing the array
>directly? for example, apc_store('conf', $conf);

That's correct. My tests showed no performance gain from using
apc_store and apc_fetch.

Looking through the APC source, it is easy to see why this is the
case. APC manipulates the array before storing it, which creates
overhead similar to serialize and unserialize. I wonder if the
APC authors wouldn't have saved themselves a lot of headaches by
just using the PHP serialize and unserialize functions?

While I haven't tested memcached, it is unlikely to be any better
because it DOES call the PHP serialize and unserialize functions!

I added a couple more tests to my script, and it seems unserialize
is actually a very efficient way of restoring an array. It is about
twice as fast as building the array in a for loop, which isn't
surprising since a for loop is interpreted and unserialize is
compiled C code.

Building the array in a for loop:
time: 0.94807291030884 seconds elapsed for 100 iterations.
memory: 115920 bytes consumed.

Unserialize from an existing global variable (no I/O):
time: 0.45300388336182 seconds elapsed for 100 iterations.
memory: 76880 bytes consumed.
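
For anyone who wants to reproduce this kind of comparison without
downloading the script, here is a minimal sketch. It is not
arraytest.php itself: the array shape (1000 string keys with short
string values), the helper names build_array and time_it, and the use
of closures (which need a newer PHP than was current when this was
written) are all assumptions made for illustration. It does not
measure memory, and absolute timings will differ from the figures
above.

```php
<?php
// Hypothetical benchmark sketch; NOT the arraytest.php script linked in this post.
// Assumption: a flat array of 1000 string keys mapped to short string values.

// Build the test array element by element in an interpreted loop.
function build_array($size) {
  $a = array();
  for ($i = 0; $i < $size; $i++) {
    $a['key_' . $i] = 'value_' . $i;
  }
  return $a;
}

// Run $callback $iterations times; return elapsed seconds and the last result.
function time_it($callback, $iterations) {
  $result = NULL;
  $start = microtime(true);
  for ($n = 0; $n < $iterations; $n++) {
    $result = call_user_func($callback);
  }
  return array(microtime(true) - $start, $result);
}

$size = 1000;
$iterations = 100;

// Serialize once up front so the timed section involves no I/O.
$serialized = serialize(build_array($size));

list($loop_time, $built) = time_it(function () use ($size) {
  return build_array($size);
}, $iterations);

list($unser_time, $restored) = time_it(function () use ($serialized) {
  return unserialize($serialized);
}, $iterations);

printf("for loop:    %.6f seconds for %d iterations\n", $loop_time, $iterations);
printf("unserialize: %.6f seconds for %d iterations\n", $unser_time, $iterations);
```

Both paths produce the same array, so the comparison is between two
ways of arriving at identical data.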

If you doubt my test results, please check out the script. It's
not particularly nice code, but I think the results are valid.
http://mel01.juggernaut.com.au/arraytest.php.gz

Perhaps someone would like to run this test script through a
profiler?


>maybe so. the fact remains that we spend significant time during every
>request unserializing these large arrays, and if we want to speed up
>drupal, we have to concentrate in this area.

Not necessarily. There are lots of areas in Drupal in which
performance improvement is possible.

The best way to improve performance is to process less data.
Smaller data means:
- more filesystem buffer hits
- more disk cache hits
- more database cache hits
- less memory moved to unserialize the data
- less memory moved to process the data.

In this case processing less data could involve:
- optimizing the size of the data stored in the arrays
- storing less data in the arrays
- splitting the arrays so only required portions are retrieved
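
As a rough illustration of the last point, one large array could be
stored as several independently serialized portions, so a request
only unserializes the portion it needs. This is only a sketch: an
in-memory array stands in for the cache table, and store_split,
load_portion and the sample keys are hypothetical names, not Drupal
functions.

```php
<?php
// Hypothetical sketch: an in-memory array stands in for a cache table.
// store_split() and load_portion() are illustrative names, not Drupal APIs.

// Serialize each top-level portion of $data under its own key.
function store_split(array &$store, $name, array $data) {
  foreach ($data as $group => $values) {
    $store[$name . ':' . $group] = serialize($values);
  }
}

// Unserialize only the requested portion; NULL if it was never stored.
function load_portion(array $store, $name, $group) {
  $key = $name . ':' . $group;
  return isset($store[$key]) ? unserialize($store[$key]) : NULL;
}

$store = array();

// Made-up sample data, grouped by the code that would use it.
$conf = array(
  'theme' => array('default' => 'bluemarine', 'logo' => 'logo.png'),
  'mail'  => array('from' => 'admin@example.com'),
);
store_split($store, 'conf', $conf);

// A request that only sends mail pays to unserialize just this portion.
$mail = load_portion($store, 'conf', 'mail');
```

The trade-off is more, smaller reads in place of one large one, so
whether this wins depends on how many portions a typical request
needs.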

And of course trying to serve more pages from the page cache.


>I should add that APC will be included as part of PHP6:

Well, if it's going to be included in core it might receive some
more love and attention. I certainly wouldn't leave APC in its
current state running on my production server.

 ...Richard.

