[development] Data import and synchronization to nodes
earnie at users.sourceforge.net
Fri Sep 14 13:50:46 UTC 2007
Quoting Alexander Barth <alex at developmentseed.org>:
> Wednesday, August 29, 2007, 8:25:15 AM, you wrote:
>> Quoting Ken Rickard <agentrickard at gmail.com>:
>>> @Earnie: Take a look at the FeedAPI SoC project, it includes pluggable XML
>>> parsing and node creation hooks.
>>>> *Note* we are aware
>>>> of all the existing modules and API and our plans are to use the
>>>> existing things as well as create what is missing.
>> XML is only one data format. GDF hopes to provide a means to I/O more
>> formats than XML or RSS. There is a huge commercial need to be able to
>> create nodes out of any format of feed.
> Hi Earnie,
> I am warming up an old thread here and I am possibly overlooking something,
> so my apologies if I ask something that's already answered on the thread.
> What's GDF? Sounds a lot like FeedAPI. Have you checked it out?
See http://portallink.linkmatics.com/gdf for an explanation of GDF.
You'll see that FeedAPI is one of the options.
> We are about to iron out the details on it; reviews, suggestions and
> patches are more than welcome at this moment.
I am having great success with the Feedparser module, which is a
replacement for the aggregator module (it cannot be used with aggregator
activated), plus some quick-and-dirty PHP that maps the incoming data to
RSS output, with the description preformatted in a table layout to
display the image with the text. I need to use the Full HTML filter for
the data display. In fact, one of the feeds I receive worked with
Feedparser with no manipulation required, and I patterned the
quick-and-dirty script after it.
I tried out FeedAPI with its predefined node processor and one of my
feeds. It refused to use Full HTML even though I specified it in two
places. Since I already had something working, I didn't look into what
it would take to make the processor work, but I added that to my round
tuit list.
The data I am parsing and processing is affiliate publishing data. It
can be in any format, and the description data may need filters as
well. The description filter would be the same for all feeds coming from
the same affiliate program. I also see a need for category filters: the
data contains the taxonomy that each provider uses, and that needs to be
mapped to the site's taxonomy structure.
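To make the category mapping concrete, here is a minimal sketch in
Python rather than Drupal PHP. Everything in it is an assumption for
illustration: the provider name, the category strings, and the
map_category helper are hypothetical and are not part of GDF, FeedAPI
or Feedparser.

```python
# Hypothetical per-provider lookup table: provider category -> site term.
# The provider key and all category names are made up for illustration.
PROVIDER_TAXONOMY_MAP = {
    "affiliate_a": {
        "Toys & Games": "Toys",
        "Kids/Books": "Books",
    },
}


def map_category(provider, provider_category, default=None):
    """Return the site taxonomy term for a provider's category.

    Falls back to `default` when the provider or the category is unmapped,
    so unmapped items can be held for review instead of being misfiled.
    """
    return PROVIDER_TAXONOMY_MAP.get(provider, {}).get(provider_category, default)
```

A per-provider table keeps each affiliate program's vocabulary
separate, in the same way the description filter would be shared by all
feeds from one program.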
The feed pull management also needs a filter for time of day. I have
one program that provides data only between the hours of x and y, so
there is no need to try outside of that time. But I would like to try
every hour between x and y. We also need to limit the number of pulls
in that hour. And I just remembered: the FeedAPI node processor doesn't
allow all of the data in the feed to be populated, only x number of
items. I need 100% of the data.
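
The time-of-day filter and the pull limit could be checked along these
lines. This is only a sketch under stated assumptions: the should_pull
function and its parameters are hypothetical, the window is assumed not
to cross midnight, and the caller is assumed to track how many pulls
have already happened in the current hour.

```python
from datetime import datetime


def should_pull(now, start_hour, end_hour, pulls_this_hour, max_pulls_per_hour):
    """Decide whether a feed may be pulled at time `now`.

    The feed is eligible only when `now` falls inside the provider's
    window [start_hour, end_hour) and the per-hour pull limit has not
    been reached. Assumes start_hour < end_hour (no midnight wrap).
    """
    in_window = start_hour <= now.hour < end_hour
    under_limit = pulls_this_hour < max_pulls_per_hour
    return in_window and under_limit


# Example: a provider that publishes between 09:00 and 17:00, one pull per hour.
allowed = should_pull(datetime(2007, 9, 14, 10, 30), 9, 17, 0, 1)
```

Run from an hourly cron job, a check like this would skip the feed
outside the provider's hours and stop repeated pulls within the same
hour.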
Earnie -- http://for-my-kids.com/