[development] aggregator2 update path complete!

Ashraf Amayreh mistknight at gmail.com
Tue May 1 20:46:34 UTC 2007


My module's architecture is already functioning as an API. Upon
installation, it creates a vocabulary that contains terms such as RSS20,
RDF10 and ATOM10. This vocabulary is attached to the feed node type. When a
user creates a feed, he has to specify one of these terms to go with the feed.
He can also create extra terms (this is where the extensibility comes in!).

Once he has created his feed, he needs to create a file associated with the
term he chose. That file, such as RSS20.inc, will be passed a simpleXML object
and the feed object. My module handles reading the URL and loading its
contents into a simpleXML object, then passes it to the feed handler, in this
case RSS20.inc. The feed handler loops over the items and extracts specific
fields such as title, body, time and image_url. This is where the
XML-specific extraction occurs. The handler then calls another (API?)
function inside my module called "add item" and passes all of the extracted
data to it. Among the passed data it can specify an unlimited number of
(extracted) terms and even new (extracted) vocabularies to group this item
under. If a term already exists, the item goes under it; otherwise the term
is created first and the item goes under it (auto-taxonomy).
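
As a rough illustration of the flow above (this is not the actual RSS20.inc),
a handler might look something like the sketch below. The
_aggregation_add_item() defined here is only a stand-in that records its
arguments so the example runs on its own; the array shape used for
$additional_taxonomies and the reuse of the body as the teaser are
assumptions made for the example.

```php
<?php
// Stand-in for the module's _aggregation_add_item(). It only records what it
// was given; the real function creates the item node.
function _aggregation_add_item($title, $body, $teaser, $original_author, $feed,
    $additional_taxonomies, $timestamp = NULL, $original_item_url = NULL,
    $guid = NULL, $image_array = NULL) {
  $GLOBALS['added_items'][] = array(
    'title' => $title,
    'body' => $body,
    'additional_taxonomies' => $additional_taxonomies,
    'timestamp' => $timestamp,
    'original_item_url' => $original_item_url,
    'guid' => $guid,
  );
}

// Hypothetical handler for the RSS20 term. It receives the simpleXML object
// and the feed, and does only the format-specific extraction.
function _aggregation_RSS20_parse($feed_XML, $feed) {
  foreach ($feed_XML->channel->item as $item) {
    $title = (string) $item->title;
    $body = (string) $item->description;

    // strtotime() returns FALSE when there is no usable date.
    $timestamp = strtotime((string) $item->pubDate);
    if ($timestamp === FALSE) {
      $timestamp = NULL;
    }

    // Collect <category> elements as extracted terms. Grouping them under a
    // vocabulary name like this is an assumed structure.
    $terms = array();
    foreach ($item->category as $category) {
      $terms[] = (string) $category;
    }

    // The body doubles as the teaser here, purely to keep the sketch short.
    _aggregation_add_item($title, $body, $body, NULL, $feed,
      array('Categories' => $terms), $timestamp,
      (string) $item->link, (string) $item->guid);
  }
}
```

The handler never touches the URL or the node system; it only maps XML
elements to arguments of the "add item" call.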

My API function can also be passed one image URL per item. It then
automatically extracts the image, inserts it as an image node, and attaches
it to the item node.

What if we have a custom XML-formatted URL to parse, some XML format called
XYZ? We create a new term called XYZ, add it to my aggregation vocabulary,
and add a file in the feed_handler sub folder called XYZ.inc. My module
retrieves the URL and loads it into a simpleXML object, then passes it to
XYZ.inc, which does the special extraction and passes the results to my
"add item" function. That function creates a node, plus an image node if a
URL was passed in by XYZ.inc, and voila, that's how easy it is to expand it.
There is no code duplication, and I'm sure there are endless ways to extend
it.
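
To make the term-to-file convention concrete, here is a minimal sketch of how
a term name could be mapped to its handler. It is not the module's actual
code: example_dispatch_to_handler() and its $handler_dir parameter are
hypothetical names, and only the feed_handler folder, the <term>.inc file
name and the _aggregation_<term>_parse() function name come from the
description above.

```php
<?php
// Hypothetical dispatcher: given a term name such as "XYZ", load XYZ.inc from
// the handler folder (if present) and call _aggregation_XYZ_parse().
function example_dispatch_to_handler($term_name, $feed_XML, $feed,
    $handler_dir = 'feed_handler') {
  // Term names end up in a file path and a function name, so only accept
  // letters, digits and underscores.
  if (!preg_match('/^[A-Za-z0-9_]+$/', $term_name)) {
    return FALSE;
  }

  $file = $handler_dir . '/' . $term_name . '.inc';
  if (is_file($file)) {
    include_once $file;
  }

  $function = '_aggregation_' . $term_name . '_parse';
  if (!function_exists($function)) {
    // No handler for this term: nothing to parse with.
    return FALSE;
  }

  // Hand the simpleXML object and the feed to the format-specific handler.
  $function($feed_XML, $feed);
  return TRUE;
}
```

With a lookup like this, supporting a new format needs no change to the
module itself: adding the term and dropping in the matching file is enough.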

This provides the ability to parse any new format using what I hope is a
simple API, and it needs everyone's help to make it perfect. As a start, we
can remove the PHP5-specific code (or, to be precise, provide conditional
alternatives?). My API/module is composed of the following functions:

function aggregation_get_URL($url, $username = NULL, $password = NULL,
    $feed = NULL, $feed_etag = NULL, $feed_last_modified = NULL)

function aggregation_get_XML($string, $feed = NULL)

function _aggregation_<term_name>_parse($feed_XML, $feed)
** (lives in the file <term_name>.inc), making it a clean add-on. I thought
of turning this into a new hook, hook_aggregation_term_name or something?
Just some thoughts... **

function _aggregation_add_item($title, $body, $teaser, $original_author,
    $feed, $additional_taxonomies, $timestamp = NULL,
    $original_item_url = NULL, $guid = NULL, $image_array = NULL)

What does the user who wishes to extend this module do? He adds one term and
one file that holds one function. The only code he writes is the
format-specific extraction code.

Now to one last important issue: I would be extremely grateful for a sponsor
for my module, whether it goes into core or not. I have 6 wonderful feature
requests that make my fingers tingle just thinking about implementing them.
But the times on my side, they are a changin', and not for the better I'm
afraid. Regardless of whether my module is sponsored or not, I'm telling
everyone who uses it that they can rest assured I will keep it in tip-top
shape. It's become so solid that I've not received a bug report for nearly
two weeks, but rather 6 feature requests, two of which are BIG. I've
received a number of thank-yous, which have made this well worth the effort.
If someone can help me by sponsoring it so I can actively continue
development, I would be extremely grateful.

On 5/1/07, Boris Mann <boris at bryght.com> wrote:
>
> On 5/1/07, Dries Buytaert <dries.buytaert at gmail.com> wrote:
>
> > My aggregator module too ... ;-)  Personally, I think we need to fix
> > this in core but hey, we've been saying that for 2 years now, and so
> > far, I've seen very few aggregator module patches.  Boris did a good
> > job outlining some of the work it takes, it's just a matter of
> > implementing some of these proposals.  Anyone willing to work on
> > this?  We can do this one patch at a time so it not particularly
> > complicated or time-consuming ...
>
> I wanted to point out that Bill from Achieve Internet updated a bunch
> and there are three ready for review already, including fixing PHP
> notices, the return (!!) of "blog this", and entity / tag stripping of
> titles. See http://groups.drupal.org/node/3538 for a few more to look
> at.
>
> All I'm doing is some bug gardening on the issue queue: feel free to
> do some review of issues (I've got some favorites in there from '05!)
> and then summarizing on the wiki.
>
> Feel free to jump in, it's a good way to power level your core commit
> points :P
>
> --
> Boris Mann
> Office 604-682-2889
> Skype borismann
> http://www.bryght.com
>