[development] Guidelines for writing efficient SQL code

Jean-Michel Pouré jm at poure.com
Wed Aug 26 13:40:26 UTC 2009


Le mercredi 26 août 2009 à 08:30 -0400, Jamie Holly a écrit :
> When writing queries they should be targeted for performance on all 
> supported RDBMS, yet the emphasis should be put on MySQL. The simple 
> fact that a vast (vast) majority of Drupal sites out there run MySQL
> is 
> reason enough for that, and until there is some major cosmic shift in 
> what hosting companies offer that's going to remain the fact.

Dear Jamie,

I have been rewriting a generic guide for both MySQL and PostrgeSQL:
Guidelines for writing efficient SQL code http://drupal.org/node/559302

These are very generic guidelines explaining why you should always log
server-side queries, use Devel module and ANALYSE your queries.

I also explain the notion of sequential scan and CPU time needed.

Then I analyse simple facts using EXPLAIN.


> Also on indexes you say "Therefore the proposed solution to enhance 
> Drupal is to hunt for multi-column indexes and replace them with
> single 
> indexes.". MySQL makes very effecient use of multi column indexes. On 
> countless queries in Drupal changing to only single column indexes
> would 
> result in full table scans, temporary tables and filesorts in
> numerous 
> places.

About dual-field indexing: PostgreSQL and MySQL work the same way. 

Some databases claim that a dual index works equally right and left.
This is not portable and probably not true, as multi-column indexing may
result in a sequential scan in complex queries:

Reference:
According to 

    http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html:

        MySQL cannot use an index if the columns do not form a leftmost
        prefix of the index.
        
    http://www.postgresql.org/docs/8.4/static/indexes-multicolumn.html

        index could in principle be used for queries that have
        constraints on b and/or c with no constraint on a — but the
        entire index would have to be scanned, so in most cases the
        planner would prefer a sequential table scan over using the
        index.

Therefore, when needed, it could be needed to add single indexes AND
keep dual indexes. 


> Do LEFT JOINS outperform INNER JOINS? Yes
> Do INNER JOINS outperform LEFT JOINS? Yes

I don't believe so because of the nature of INNER JOINs which remove
NULL rows and thus consume more CPU time. They only work faster on
indexes on dual-fields when used properly.

This needs to be studied using ANALYSE on very large SQL servers, not
simple installs. Drupal website database probably is a good testing
system.

I hope that you can comment my code and improve the English and
technical aspects. 

Kind regards,
Jean-Michel



-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: Ceci est une partie de message num?riquement sign?e
URL: <http://lists.drupal.org/pipermail/development/attachments/20090826/26b6675c/attachment.pgp>


More information about the development mailing list