[development] PHP 5 > aggregator.module rewrite to XML API?

Ashraf Amayreh mistknight at gmail.com
Tue Jun 19 20:08:30 UTC 2007


Ahhh... so by sanitizing you mean accepting feeds that are not fully
standards-compliant? If that's what you mean, then definitely not. I totally
agree with Larry on this. Why waste my processing time on feeds that are not
worth a penny?

Also, I haven't received a single complaint in my issue queue about a feed
that would not parse. That's not because I do any sanitization. I think the
reason is that I parse the feed as XML and look for the main components, so
even a feed that is not fully conforming will work as long as it has those
components. If it's beyond hope of being called XML in the first place, or
just totally messed up, I don't waste time on it, or the coding effort that
might double or triple my module's code size. If that's a type of
sanitization then good for me, but it definitely does not affect my module's
performance :-)
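
As a rough illustration of the approach described above: the sketch below is
Python, not the module's actual PHP code, and the function name parse_feed
and the plain RSS 2.0 layout (no namespaces) are assumptions made only for
this example. It rejects anything that is not well-formed XML, then picks out
the main components and ignores everything else.

```python
import xml.etree.ElementTree as ET


def parse_feed(xml_text):
    """Return the feed title and items, or None if the feed is unusable."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Not well-formed XML: give up instead of trying to repair it.
        return None

    # Assumption: an RSS 2.0 style <rss><channel>...</channel></rss> layout.
    channel = root.find("channel")
    if channel is None:
        return None

    items = []
    for item in channel.iter("item"):
        # Missing elements become empty strings; unknown elements are ignored.
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        if title or link:
            items.append({"title": title, "link": link})

    return {
        "title": channel.findtext("title", default="").strip(),
        "items": items,
    }
```

A feed with extra or missing optional elements still yields items here, while
a document with mismatched tags returns None without any repair attempt.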

Finally, I don't really care to make my module work with everyone and
everything. That's why I have clear PHP 5 and cURL requirements. In my
opinion, if a person is not prepared to get a good environment for his site,
or lets something as crappy as a host dictate his platform (let's drop
Drupal because our host is using PHP 3.0 looool), then he's definitely not
in my target audience. PHP 5 hosting has become as common as PHP 4 hosting
and is very near it in price.

VPS hosting is becoming cheaper by the day for anyone who's serious about a
site. Take a look for yourselves. I use the VPS provider linked below to
sharpen my LAMP skills, since it provides a clean-slate installation. I have
root access and can do anything I want there. The comparison between HTML and
XML feeds is simply flawed IMHO.

http://www.vpslink.com/

On 6/19/07, Morbus Iff <morbus at disobey.com> wrote:
>
> > RSS is XML.  The XML spec explicitly says that invalid files should be
> > discarded, not guessed at the way HTML is.  Trying to make sense of a broken
> > RSS feed is explicitly contrary to the spec.  So, er, why are we spending so
> > much time trying to sanitize?  If it doesn't parse correctly, report an
> > error "this site's RSS feed is f*ed up, tell 'em to fix it".  Am I missing
> > something here?
>
> Did you forget Postel's Law? Or the fact that for a feed to be
> considered "invalid" (as opposed to "not well-formed") would mean that
> Drupal would have to have a validating document type parser?
>
> http://www.w3.org/TR/REC-xml/#dt-valid
> http://www.w3.org/TR/REC-xml/#dt-wellformed
>
> And, honestly, telling people that their RSS is malformed and "pls fix,
> k thanks" is about as viable as telling someone that their HTML isn't
> well formed. It just ain't going to happen.
>
> --
> Morbus Iff ( be realistic. demand the impossible. )
> Technical: http://www.oreillynet.com/pub/au/779
> Culture: http://www.disobey.com/ and http://www.gamegrene.com/
> aim: akaMorbus / skype: morbusiff / icq: 2927491 / jabber.org: morbus
>