[drupal-devel] Re: Database analysis of CivicSpace/Drupal

Kieran Lal kieran at civicspacelabs.org
Sat Apr 16 03:11:59 UTC 2005

Craig, in my first round of analysis I actually outputted record size  
along with indices.  It's certainly important I agree, but each  
instance of Drupal will have different usage patterns and so we need to  
do some high level analysis of the database schema first.

For example, on CivicSpaceLabs.org the biggest table is the search  
table.  It has 500 000 records.  Compare this to the node table with 11  
000 records.  However, that doesn't mean you shouldn't a index table  
that has only 5 000 records.  If it has JOINs you should index until  
you can disapprove the need for it.  Usually, that happens when you are  
doing a lot of updates, and the indexing overhead is too punishing.

I did recommend that this method be repeated on multiple Drupal  
application instances.  Once we have a good grasp on the database  
design, we should do real world performance analysis.  Unfortunately,  
we don't control the hardware platform and it's unlikely we will enough  
performance analysis to check across multiple platforms to be sure we  
have the best general solution.   Making changes to the Devel.module is  
good step towards this.


On Apr 15, 2005, at 7:25 PM, Craig Courtney wrote:

> Dan has a very good point.  Make sure that this is a real problem  
> before changing it.  Mysql makes some weird selections on indexes when  
> joins are involved.  Creating an index just because you join on that  
> field is not a good move.  Make liberal use of explain before and  
> after your index creates to make sure it even gets used.  Also take  
> into account expected table sizes.
> Craig
> On Apr 15, 2005, at 10:28 PM, Kieran Lal wrote:
>> How about that 68 fields involved in JOINs are not indexed?
>> So every time there is a join operation the database has to do search  
>> through every record on that field.  Of course, if I had just waited  
>> until people thought it was a problem we wouldn't know it was a  
>> problem ;-)
>> Kieran
>> On Apr 15, 2005, at 6:36 PM, Dan Robinson wrote:
>>> what problem are we trying to solve here?
>>> Dan
>>>> There are currently 107 database tables in the latest release of  
>>>> CivicSpace. I believe there are 55 in Drupal core.  One of the most  
>>>> important things you can do in improving database performance is to  
>>>> make sure you have the right indicies for your database. Currently  
>>>> we do not use foriegn keys to enforce relationships between  
>>>> database tables.  I did an analysis of the CivicSpace code base and  
>>>> found that there was 156 SQL statements with at least one JOIN. We  
>>>> can consider these as application level relationships between  
>>>> tables. Here is a visual representation of those JOINs:  
>>>> http://civicspacelabs.org/home/files/schema.png I also analysed the  
>>>> indicies of the current database schema. There are 182 indicies.
>>>> The question was do they match? The answer was no.
>>>> There are at least 68 fields that are joined which do not have  
>>>> indicies from my initial analysis. Here are the fields in my blog  
>>>> post:
>>>> http://civicspacelabs.org/home/node/12245
>>>> The scientific method teaches us that results are useless, unless  
>>>> you have a repeatable method.  I am going to explain in this book(  
>>>> http://civicspacelabs.org/home/node/12246 ), Drupal-Docs folks, how  
>>>> I conducted this analysis and it needs to be repeated and verified  
>>>> before being acted upon. I built several tools with help from  
>>>> Drumm, Ankur, Tom Eliaz, and Chris Johnson(cxj). It's possible  
>>>> those tools ran amok.
>>>> This is the first step in what I hope will be a continued analysis  
>>>> of the database layer in Drupal. The next step will be to extend  
>>>> devel.module to do batch updates of page executions times and mysql  
>>>> query times. I have several theories about how to do this properly  
>>>> and would appreciate your input. The final stage will be to create  
>>>> a documented database schema that allows for external application  
>>>> integration with common labels. If we want to integrate with  
>>>> non-drupal applications we will have to allow for semantic mapping  
>>>> of our database schema to a common database bus architecture(See  
>>>> Ralph Kimball http://www.rkimball.com/).  But I'lll leave that for  
>>>> another discussion.
>>>> Cheers,
>>>> Kieran
>>>> P.S. I am off to go mountain biking, hiking, and kayaking in Hawaii  
>>>> for a week so don't expect too many responses from me.
>>>> -------------------------------------------------------------------- 
>>>> -
>>>> To unsubscribe, e-mail:  
>>>> civicspace-dev-unsubscribe at civicspacelabs.org
>>>> For additional commands, e-mail:  
>>>> civicspace-dev-help at civicspacelabs.org

More information about the drupal-devel mailing list