[support] How does Boost determine which URLs to crawl?

Sat Feb 20 12:15:17 UTC 2010

Hi,

I've installed Boost on a site that will have a very large number of 
distinct URLs (in the millions), because taxonomy terms are 
cross-referenced for a classifieds-type site. For example, I have a URL 
like business-finance/insurance/insurance-brokers, and you can 
"cross-reference" it with a region, e.g. vic/melbourne-city/richmond, to 
get the URL 
business-finance/insurance/insurance-brokers/vic/melbourne-city/richmond.  
I generate these URLs on the fly in a custom module, as there are 500 
terms in one vocabulary and 30k in the other.

My question is: will Boost's crawler find these URLs?  If I navigate to 
any of them manually, the page gets cached and loads much quicker, but 
running the crawler doesn't seem to cache anything or put anything in 
the boost_crawler table.  Is there a way to get Boost to find these URLs?  
They are all linked from a block on the home page (well, the top levels 
are: business-finance/insurance/insurance-brokers and 
vic/melbourne-city/richmond are both on the home page, and the 
cross-reference is available if you click on either of those), but I 
suspect that Boost isn't seeing them.
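
To make the URL scheme concrete, here is a rough sketch in Python. It is 
not the actual module (that is a Drupal module), the function name is 
made up for illustration, and it assumes every term in one vocabulary 
can be paired with every term in the other, so the total is an upper 
bound:

```python
from itertools import product


def cross_reference_paths(category_paths, region_paths):
    """Yield one combined path per (category, region) pair.

    Hypothetical helper for illustration only; the real site builds
    these paths on the fly inside a custom Drupal module.
    """
    for category, region in product(category_paths, region_paths):
        yield f"{category}/{region}"


# Sample terms taken from the example URLs in this post.
categories = ["business-finance/insurance/insurance-brokers"]
regions = ["vic/melbourne-city/richmond"]

combined = list(cross_reference_paths(categories, regions))
print(combined[0])

# Upper bound on combined URLs, assuming all 500 terms in one
# vocabulary can be crossed with all 30k terms in the other.
total_urls = 500 * 30_000
print(total_urls)
```

With 500 x 30k combinations, the crawler would need to discover on the 
order of 15 million URLs, none of which are linked directly from the 
home page.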

Hope that makes sense.
Thanks,
Anthony.