I'm not really sure about the argument for sanitizing data. Can't we sanitize it in a little less than 11 seconds? Also, isn't there a possibility that the user wants this HTML code to come in as HTML rather than plain text?

I would guess that my module does lack many sanity checks, but at the same time, I assume that administrators should be responsible for which feeds they add to their sites. By the way, any sanity gurus who would like to review my module's sanity checks and help me add more are very welcome and have my full gratitude. Just drop me a line off-list.

On 6/19/07, Morbus Iff <morbus@disobey.com> wrote:
Unfortunately, we can't take these statistics as canon:
* there are no instructions on how to duplicate the tests.
* the SimplePie result is an estimate ("At SimplePie I have to do an estimate, because the feed download time was accumulated to the measure.")
* it is unknown whether the other feed parsers are doing the same sanitization that SimplePie does, which, again, adds more time to its results.
I have done some quick tests, using the same URL as Aron:
http://www.christiannewswire.com/rss/catfeed_2.xml
I downloaded this file to my desktop. I will be passing this string into SimplePie instead of allowing SimplePie to download it. The file is 1M:
1027320 Jun 19 11:50 catfeed_2.xml
This is the script I used with SimplePie 1.0 b3.2 (20061124):
<?php
$handle = fopen('./catfeed_2.xml', "r");
$contents = fread($handle, filesize('./catfeed_2.xml'));

require './simplepie.inc';
$feed = new SimplePie();
$feed->set_raw_data($contents);
$feed->init();
$parsed = $feed->get_items();
?>
With this command line:
~/Desktop > date && php simplepie.php && date
Tue Jun 19 12:26:10 EDT 2007
Tue Jun 19 12:26:22 EDT 2007
As you can see, this does confirm the 10 to 12 second parse time, and that is with all the sanitization that SimplePie does by default. However, SimpleFeed and FeedParser both ship with the latest development version of SimplePie, which includes an option to turn this sanitization off:
$feed->set_stupidly_fast(TRUE);
I grabbed today's development version, added the above line before the ->init() in the above script, and reran:
~/Desktop > date && php simplepie.php && date
Tue Jun 19 12:28:54 EDT 2007
Tue Jun 19 12:28:55 EDT 2007
You'll notice that it now takes only about 1 second, which removes any concern in my mind that SimplePie is comparatively a bad choice (since one would assume you'd sanitize the data as necessary within Drupal).
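
Timing a script by printing `date` before and after it means subtracting the two timestamps by hand. A minimal sketch of a helper that does the subtraction itself follows. It assumes a `date` that supports `+%s` (GNU and BSD/macOS do), and `elapsed_seconds` is a hypothetical name, not part of PHP or SimplePie. Resolution is still whole seconds.

```shell
# Hypothetical helper (not part of PHP or SimplePie): run a command and
# print how many whole seconds of wall-clock time it took.
# Assumes `date +%s` (seconds since the epoch) is supported.
elapsed_seconds() {
    start=$(date +%s)
    # Discard the command's own output so only the timing is printed.
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Example usage with the test script from this message (assumes php and
# simplepie.php are present in the current directory):
#   elapsed_seconds php simplepie.php
```

In bash, the built-in `time` keyword (`time php simplepie.php`) reports sub-second figures, which may be more useful for comparing runs that differ by less than a second.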
-- 
Morbus Iff ( and think about the bad things that I didn't do )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
aim: akaMorbus / skype: morbusiff / icq: 2927491 / jabber.org: morbus