[development] Performance Testing with a Big Test Database

Larry Garfield larry at garfieldtech.com
Thu May 15 15:03:42 UTC 2008


What are the disk space requirements for Xapian?  At least in my experience, the giant size of the index is more of an issue than runtime.  The indexes are easily larger than the content in question, by a factor of four.  (I'm about to disable search on one site to avoid getting the web host mad at me for database size.)

--Larry Garfield

On Thu, 15 May 2008 09:07:07 -0400, Doug Green <douggreen at douggreenconsulting.com> wrote:
> AFAICT, Xapian replaces the indexing.  I looked at the code because
> of this post to the devel list.  To use Xapian, you have to patch core.
> We'd like to make this sort of thing easier.  We discussed it some at
> the sprint.  I believe that Earnest Berry started refactoring code into
> a search module, an indexing module and a UI module where the indexing
> or UI modules could be replaced.
> 
> We need more indexing performance tuning, but on the plane ride home I
> came up with 3 small improvements (257910, 257912, 257916).
> 
> Earnie Boyd wrote:
>> Quoting Simon Lindsay <simon at iseek.biz>:
>>
>>> Doug Green wrote:
>>>> One product of the search sprint is this large database for testing...
>>>> http://civicactions.s3.amazonaws.com/drupal6_100k.mysql.gz
>>>
>>> Hello Doug,
>>>
>>> You may not have seen it, but Trellon recently sponsored furthering
>>> some development which we did with the Xapian search engine, and
>>> integrating it in to Drupal.
>>>
>>> http://drupal.org/project/xapian
>>>
>>> Michael has done some preliminary performance testing, with almost
>>> 100,000 records created with devel module, and posted the results here.
>>>
>>> http://www.trellon.com/blog/xapian-search-drupal
>>>
>>> Perhaps this may also be of interest for people looking into the
>>> drupal search engine.
>>>
>>
>> Is this xapian module creating the search indexes as well, or just the
>> UI functional pieces?  While the UI is important, the actual parsing
>> of the node data to search is in need of some performance tuning as
>> well.
>>
>> Earnie -- http://for-my-kids.com/
>> -- http://give-me-an-offer.com/
>>
>>
> 
> 
> --
> Doug Green
> douggreen at douggreenconsulting.com
> 904-583-3342
> 
> Bringing Ideas to Life with Software Artistry and Invention...


