[development] PHP 5 > aggregator.module rewrite to XML API?

Ashraf Amayreh mistknight at gmail.com
Tue Jun 19 17:12:45 UTC 2007


I'm not really sure about the argument to sanitize data. Can't we sanitize
it in a little less than 11 seconds? Also, isn't there a possibility the
user wants this HTML code to come in as HTML code rather than plain text?
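
To make the HTML-versus-plain-text question concrete, here is a minimal sketch of letting the caller choose. The function name render_item_body() and the tag whitelist are assumptions made for illustration; neither comes from SimplePie, Drupal, or my module, and the HTML branch is deliberately naive.

```php
<?php
// Hypothetical helper (not part of SimplePie or Drupal): return a feed item
// body either as escaped plain text or as HTML limited to a few tags.
function render_item_body($body, $as_html = FALSE) {
  if ($as_html) {
    // strip_tags() drops every tag not listed here, but it leaves the
    // attributes of the allowed tags untouched, so this is NOT a complete
    // XSS filter. It only illustrates the "keep it as HTML" choice.
    return strip_tags($body, '<p><a><em><strong>');
  }
  // Plain-text mode: escape the markup so it is displayed literally.
  return htmlspecialchars($body, ENT_QUOTES, 'UTF-8');
}

$body = '<p>Hello <script>alert(1)</script><em>world</em></p>';
echo render_item_body($body), "\n";        // escaped, shown as text
echo render_item_body($body, TRUE), "\n";  // whitelisted tags kept
```

Both branches are single passes over the string, so a choice like this should not by itself account for a multi-second difference; a real filter that also inspects attributes would cost more.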

I would guess that my module does lack many sanity checks, but at the same
time, I do assume that administrators should be responsible for which feeds
they add to their sites.
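
As an example of the kind of sanity check I have in mind, here is a rough sketch that accepts only well-formed http or https feed URLs. The function name feed_url_is_sane() is hypothetical and is not code from my module; filter_var() needs PHP 5.2 or later.

```php
<?php
// Hypothetical sanity check (not taken from any module in this thread):
// accept a feed URL only if it is well formed and uses http or https.
function feed_url_is_sane($url) {
  // Reject anything PHP does not consider a syntactically valid URL.
  if (filter_var($url, FILTER_VALIDATE_URL) === FALSE) {
    return FALSE;
  }
  // Restrict the scheme, so ftp:, file: and similar URLs are refused.
  $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
  return in_array($scheme, array('http', 'https'), TRUE);
}

var_dump(feed_url_is_sane('http://www.christiannewswire.com/rss/catfeed_2.xml'));
var_dump(feed_url_is_sane('ftp://example.com/feed.xml'));
```

A check like this only guards against malformed input. It says nothing about whether the feed's content is trustworthy, which is still the administrator's call.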

By the way, any sanity gurus who would like to check on my module's sanity
checks and help me with additional sanity checks are very welcome and have
my full gratitude. Just drop me a line off-list.

On 6/19/07, Morbus Iff <morbus at disobey.com> wrote:
>
> > Unfortunately, we can't take these statistics as canon:
> >
> >   * there are no instructions on how to duplicate the results.
> >
> >   * the SimplePie result is an estimate ("At SimplePie I have to
> >     do an estimate, because the feed download time was accumulated
> >     to the measure.")
> >
> >   * it is unknown whether the other feed parsers are doing the
> >     same sanitization that SimplePie does, again, which adds
> >     more time to the results.
>
> I have done some quick tests, using the same URL as Aron:
>
>   http://www.christiannewswire.com/rss/catfeed_2.xml
>
> I downloaded this file to my desktop. I will be passing this string into
> SimplePie instead of allowing SimplePie to download it. The file is 1M:
>
>   1027320 Jun 19 11:50 catfeed_2.xml
>
> This is the script I used with SimplePie 1.0 b3.2 (20061124):
>
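>    <?php
>      // Read the locally saved feed so download time is not measured.
>      $handle = fopen('./catfeed_2.xml', "r");
>      $contents = fread($handle, filesize('./catfeed_2.xml'));
>      fclose($handle);
>
>      // Hand the raw XML to SimplePie instead of letting it fetch a URL.
>      require './simplepie.inc';
>      $feed = new SimplePie();
>      $feed->set_raw_data($contents);
>      $feed->init();
>      $parsed = $feed->get_items();
>    ?>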
>
> With this command line:
>
>    ~/Desktop > date && php simplepie.php && date
>    Tue Jun 19 12:26:10 EDT 2007
>    Tue Jun 19 12:26:22 EDT 2007
>
> As you can see, this does confirm the 10 or 12 second parse time -- it
> is also using all the sanitization that SimplePie does by default.
> However, SimpleFeed and FeedParser both ship with the latest development
> version of SimplePie, which includes an option to stop this sanitization:
>
>    $feed->set_stupidly_fast(TRUE);
>
> I grabbed today's development version, added the above
> line before the ->init() in the above script, and reran:
>
>    ~/Desktop > date && php simplepie.php && date
>    Tue Jun 19 12:28:54 EDT 2007
>    Tue Jun 19 12:28:55 EDT 2007
>
> You'll notice that it is only 1 second, which removes all doubt in my
> mind that SimplePie is a bad thing comparatively (since one would assume
> you'd sanitize the data as necessary within Drupal).
>
> --
> Morbus Iff ( and think about the bad things that I didn't do )
> Technical: http://www.oreillynet.com/pub/au/779
> Culture: http://www.disobey.com/ and http://www.gamegrene.com/
> aim: akaMorbus / skype: morbusiff / icq: 2927491 / jabber.org: morbus
>