[development] Guidelines for writing efficient SQL code

David Metzler metzlerd at metzlerd.com
Wed Aug 26 14:25:05 UTC 2009


A good decision on wether to use multi-column indexes will take into  
account the cardinality of the data.  Sometimes a multi-column index  
is simply the right answer, particularly if the values in the first  
column of the  index do not limit the rows sufficiently.  Also  in  
where exists clauses the rdbms will not even load the row if the  
primary key is the composite index, but it will have to.  Indexes are  
best thought of like caches,  You use the right caching mechanism for  
the job, and you only do it when you can demonstrate performance  
improvement.

The mysql reference is true but not a matter of concern in a large  
number of cases when composite keys are being used.  Analyising  
complex queries is  a good idea, but hunting out and eliminating  
multi-column indexes doesn't seem helpful.

On Aug 26, 2009, at 6:40 AM, Jean-Michel Pouré wrote:

> Le mercredi 26 août 2009 à 08:30 -0400, Jamie Holly a écrit :
>> When writing queries they should be targeted for performance on all
>> supported RDBMS, yet the emphasis should be put on MySQL. The simple
>> fact that a vast (vast) majority of Drupal sites out there run MySQL
>> is
>> reason enough for that, and until there is some major cosmic shift in
>> what hosting companies offer that's going to remain the fact.
>
> Dear Jamie,
>
> I have been rewriting a generic guide for both MySQL and PostrgeSQL:
> Guidelines for writing efficient SQL code http://drupal.org/node/ 
> 559302
>
> These are very generic guidelines explaining why you should always log
> server-side queries, use Devel module and ANALYSE your queries.
>
> I also explain the notion of sequential scan and CPU time needed.
>
> Then I analyse simple facts using EXPLAIN.
>
>
>> Also on indexes you say "Therefore the proposed solution to enhance
>> Drupal is to hunt for multi-column indexes and replace them with
>> single
>> indexes.". MySQL makes very effecient use of multi column indexes. On
>> countless queries in Drupal changing to only single column indexes
>> would
>> result in full table scans, temporary tables and filesorts in
>> numerous
>> places.
>
> About dual-field indexing: PostgreSQL and MySQL work the same way.
>
> Some databases claim that a dual index works equally right and left.
> This is not portable and probably not true, as multi-column  
> indexing may
> result in a sequential scan in complex queries:
>
> Reference:
> According to
>
>     http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html:
>
>         MySQL cannot use an index if the columns do not form a  
> leftmost
>         prefix of the index.
>
>     http://www.postgresql.org/docs/8.4/static/indexes-multicolumn.html
>
>         index could in principle be used for queries that have
>         constraints on b and/or c with no constraint on a — but the
>         entire index would have to be scanned, so in most cases the
>         planner would prefer a sequential table scan over using the
>         index.
>
> Therefore, when needed, it could be needed to add single indexes AND
> keep dual indexes.
>
>
>> Do LEFT JOINS outperform INNER JOINS? Yes
>> Do INNER JOINS outperform LEFT JOINS? Yes
>
> I don't believe so because of the nature of INNER JOINs which remove
> NULL rows and thus consume more CPU time. They only work faster on
> indexes on dual-fields when used properly.
>
> This needs to be studied using ANALYSE on very large SQL servers, not
> simple installs. Drupal website database probably is a good testing
> system.
>
> I hope that you can comment my code and improve the English and
> technical aspects.
>
> Kind regards,
> Jean-Michel
>
>
>



More information about the development mailing list