[drupal-devel] Re: Database analysis of CivicSpace/Drupal

Craig Courtney craig at dawnsedge.com
Sat Apr 16 02:34:58 UTC 2005


Dan has a very good point.  Make sure that this is a real problem 
before changing it.  Mysql makes some weird selections on indexes when 
joins are involved.  Creating an index just because you join on that 
field is not a good move.  Make liberal use of explain before and after 
your index creates to make sure it even gets used.  Also take into 
account expected table sizes.

Craig

On Apr 15, 2005, at 10:28 PM, Kieran Lal wrote:

> How about that 68 fields involved in JOINs are not indexed?
>
> So every time there is a join operation the database has to do search 
> through every record on that field.  Of course, if I had just waited 
> until people thought it was a problem we wouldn't know it was a 
> problem ;-)
>
> Kieran
> On Apr 15, 2005, at 6:36 PM, Dan Robinson wrote:
>
>> what problem are we trying to solve here?
>>
>> Dan
>>
>>> There are currently 107 database tables in the latest release of 
>>> CivicSpace. I believe there are 55 in Drupal core.  One of the most 
>>> important things you can do in improving database performance is to 
>>> make sure you have the right indicies for your database. Currently 
>>> we do not use foriegn keys to enforce relationships between database 
>>> tables.  I did an analysis of the CivicSpace code base and found 
>>> that there was 156 SQL statements with at least one JOIN. We can 
>>> consider these as application level relationships between tables. 
>>> Here is a visual representation of those JOINs: 
>>> http://civicspacelabs.org/home/files/schema.png I also analysed the 
>>> indicies of the current database schema. There are 182 indicies.
>>>
>>> The question was do they match? The answer was no.
>>>
>>> There are at least 68 fields that are joined which do not have 
>>> indicies from my initial analysis. Here are the fields in my blog 
>>> post:
>>> http://civicspacelabs.org/home/node/12245
>>>
>>> The scientific method teaches us that results are useless, unless 
>>> you have a repeatable method.  I am going to explain in this book( 
>>> http://civicspacelabs.org/home/node/12246 ), Drupal-Docs folks, how 
>>> I conducted this analysis and it needs to be repeated and verified 
>>> before being acted upon. I built several tools with help from Drumm, 
>>> Ankur, Tom Eliaz, and Chris Johnson(cxj). It's possible those tools 
>>> ran amok.
>>>
>>> This is the first step in what I hope will be a continued analysis 
>>> of the database layer in Drupal. The next step will be to extend 
>>> devel.module to do batch updates of page executions times and mysql 
>>> query times. I have several theories about how to do this properly 
>>> and would appreciate your input. The final stage will be to create a 
>>> documented database schema that allows for external application 
>>> integration with common labels. If we want to integrate with 
>>> non-drupal applications we will have to allow for semantic mapping 
>>> of our database schema to a common database bus architecture(See 
>>> Ralph Kimball http://www.rkimball.com/).  But I'lll leave that for 
>>> another discussion.
>>>
>>> Cheers,
>>> Kieran
>>>
>>> P.S. I am off to go mountain biking, hiking, and kayaking in Hawaii 
>>> for a week so don't expect too many responses from me.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: civicspace-dev-unsubscribe at civicspacelabs.org
>>> For additional commands, e-mail: 
>>> civicspace-dev-help at civicspacelabs.org
>>>
>>>
>>
>




More information about the drupal-devel mailing list