[drupal-devel] Re: Database analysis of CivicSpace/Drupal

Kieran Lal kieran at civicspacelabs.org
Sat Apr 16 02:57:03 UTC 2005


I'd be happy to debate anyone in the community on the merits of  
relational databases versus other data stores for applications.

It's not just the relational structure but the overwhelming  
infrastructure of developers, design tools, data reporting  
capabilities, and widely available expertise that overwhelms other data  
management methods.

The evolution from hierarchical, to relational, to network  
databases(Object, XML) has been run through multiple times over the  
last 4 decades.  Every time relational wins hands down.

Cheers,
Kieran

On Apr 15, 2005, at 7:39 PM, Anselm Hook wrote:

>
> I had the pleasure of working with an rdf based datastore on another
> project and one of the things that strikes me about the difference  
> between
> relational databases and rdf datastores is the surprising elegance of  
> rdf
> in terms of back end queries for a developer.
>
> In an rdf datastore such as redland there is only one 'table' in a  
> sense -
> or best thought of as just a list of subjects.  You might have subjects
> that describe posts, stories, people, other objects - but these do not
> require separate tables.  You can decorate any subject (or 'key') with  
> any
> field - and all fields are first class citizens - wheras in drupal many
> fields are tacked into a kind of varchar bucket.  Reconstituting an  
> object
> in drupal involves calling methods that know about that object.  In
> redland you just ask for all the values with the given subject key.
>
> Just a comment in any case; not meant to be a troll that incites  
> lengthy
> aimless discussion... :-) Feel free to mail me directly though if you
> think this is a horrible generalization on my part - since I do not  
> want
> to bog the list with what is effectively a troll.  I agree it is
> obviously not possible to change architectures at this low a level in
> drupal - although I do note that IndyVoter is going the RDF route.
>
>  - a
>
>
> On Fri, 15 Apr 2005, Kieran Lal wrote:
>
>> How about that 68 fields involved in JOINs are not indexed?
>>
>> So every time there is a join operation the database has to do search
>> through every record on that field.  Of course, if I had just waited
>> until people thought it was a problem we wouldn't know it was a  
>> problem
>> ;-)
>>
>> Kieran
>> On Apr 15, 2005, at 6:36 PM, Dan Robinson wrote:
>>
>>> what problem are we trying to solve here?
>>>
>>> Dan
>>>
>>>> There are currently 107 database tables in the latest release of
>>>> CivicSpace. I believe there are 55 in Drupal core.  One of the most
>>>> important things you can do in improving database performance is to
>>>> make sure you have the right indicies for your database. Currently  
>>>> we
>>>> do not use foriegn keys to enforce relationships between database
>>>> tables.  I did an analysis of the CivicSpace code base and found  
>>>> that
>>>> there was 156 SQL statements with at least one JOIN. We can consider
>>>> these as application level relationships between tables. Here is a
>>>> visual representation of those JOINs:
>>>> http://civicspacelabs.org/home/files/schema.png I also analysed the
>>>> indicies of the current database schema. There are 182 indicies.
>>>>
>>>> The question was do they match? The answer was no.
>>>>
>>>> There are at least 68 fields that are joined which do not have
>>>> indicies from my initial analysis. Here are the fields in my blog
>>>> post:
>>>> http://civicspacelabs.org/home/node/12245
>>>>
>>>> The scientific method teaches us that results are useless, unless  
>>>> you
>>>> have a repeatable method.  I am going to explain in this book(
>>>> http://civicspacelabs.org/home/node/12246 ), Drupal-Docs folks, how  
>>>> I
>>>> conducted this analysis and it needs to be repeated and verified
>>>> before being acted upon. I built several tools with help from Drumm,
>>>> Ankur, Tom Eliaz, and Chris Johnson(cxj). It's possible those tools
>>>> ran amok.
>>>>
>>>> This is the first step in what I hope will be a continued analysis  
>>>> of
>>>> the database layer in Drupal. The next step will be to extend
>>>> devel.module to do batch updates of page executions times and mysql
>>>> query times. I have several theories about how to do this properly
>>>> and would appreciate your input. The final stage will be to create a
>>>> documented database schema that allows for external application
>>>> integration with common labels. If we want to integrate with
>>>> non-drupal applications we will have to allow for semantic mapping  
>>>> of
>>>> our database schema to a common database bus architecture(See Ralph
>>>> Kimball http://www.rkimball.com/).  But I'lll leave that for another
>>>> discussion.
>>>>
>>>> Cheers,
>>>> Kieran
>>>>
>>>> P.S. I am off to go mountain biking, hiking, and kayaking in Hawaii
>>>> for a week so don't expect too many responses from me.
>>>>
>>>>
>>>> -------------------------------------------------------------------- 
>>>> -
>>>> To unsubscribe, e-mail:  
>>>> civicspace-dev-unsubscribe at civicspacelabs.org
>>>> For additional commands, e-mail:
>>>> civicspace-dev-help at civicspacelabs.org
>>>>
>>>>
>>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: civicspace-dev-unsubscribe at civicspacelabs.org
>> For additional commands, e-mail:  
>> civicspace-dev-help at civicspacelabs.org
>>
>>
>
>




More information about the drupal-devel mailing list