Most of the time, you can get to the posts via RSS. The Aggregator module does a pretty good job of pulling items in, and the author displayed for each post is whatever you tell it to display (see Drupal Planet for an example).

I second the recommendation of using QueryPath. I use it almost exclusively, along with drupal_http_request(); I use curl in only a few places (if you use curl, I recommend http://drupal.org/project/curl for a dependency check). I'd really recommend, though, creating a custom module that uses the above and holds your filtering logic. I've done this for about a dozen modules now.

That said, there are more modules available nowadays, such as http://drupal.org/project/feeds_xpathparser used with Feeds (http://drupal.org/project/feeds). About a dozen more modules will accomplish the goal; though I haven't used them, I did go through and try most of the methods out for some recent projects.

Cheers,
Kevin O'Brien
Drupal Developer
415-754-0112
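
To make the "custom module with your own filtering logic" idea concrete, here is a rough sketch of the two extraction steps James asked about: the page title and the provider. It is written in Python purely as an illustration; in a Drupal module the fetch would go through drupal_http_request() and the parsing through QueryPath. The names TitleParser, extract_title and provider_from_url are made up for this sketch, and treating the link's hostname as the "provider" is an assumption, not something the thread settled on.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element in a document."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> (e.g. inside inline SVG)

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def extract_title(html):
    """Return the page title with runs of whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    return " ".join("".join(parser.parts).split())


def provider_from_url(url):
    """Assumption: the provider is the link's hostname without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host
```

The HTML string passed to extract_title() would be the body of the HTTP response for the Resource's link; the hostname rule is where site-specific filtering would most likely need to grow.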
On Tue, Nov 30, 2010 at 11:26 AM, <development-request@drupal.org> wrote:
Today's Topics:
1. Drupal module for scraping information from an HTML/XML
document (James Benstead)
2. Re: Drupal module for scraping information from an HTML/XML
document (John Fiala)
3. Easter problem (Ámon Tamás)
4. Re: Easter problem (Carl Wiedemann)
5. Re: Easter problem (larry@garfieldtech.com)
6. Re: Easter problem (jeff@ayendesigns.com)
7. Re: Easter problem (larry@garfieldtech.com)
8. Re: Easter problem (Jennifer Hodgdon)
----------------------------------------------------------------------
Message: 1
Date: Tue, 30 Nov 2010 18:56:09 +0000
From: James Benstead <james.benstead@gmail.com>
Subject: [development] Drupal module for scraping information from an
HTML/XML document
To: development <development@drupal.org>
Message-ID:
<AANLkTi=AFhBkvyURzgwNB54Z+q-rRj_B_uRLZbUUd3UV@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
I've finally got round to doing some serious work on Drupalversity, an open,
web-based Drupal education project I've had in mind for a year or so.
People who use Drupalversity to learn have the option of adding Resources to
the site - i.e., links to posts at Lullabot, Chapter3 etc that explain how
to do specific things with Drupal. A Resource is a custom content type that
includes a link to the resource and a text field containing a description of
that resource.
What I'd like to do once a Resource has been added to the site is to scrape
certain information from it: at this point I'm thinking the Title of the
page the link points to and the provider of the resource - e.g., which
Drupal shop originally created the resource. What's the best way to go about
doing this? I'm pretty sure there's not a Drupal module that solves the
problem out of the box.
So far I've considered:
- http://drupal.org/project/querypath
- Drupal's built-in drupal_http_request() -
http://api.drupal.org/api/drupal/includes--common.inc/function/drupal_http_request/6
- curl
Thanks,
--Jim
--
My IM and Skype details are at http://state68.com/contact
------------------------------
Message: 2
Date: Tue, 30 Nov 2010 12:06:33 -0700
From: John Fiala <jcfiala@gmail.com>
Subject: Re: [development] Drupal module for scraping information from
an HTML/XML document
To: development@drupal.org
Message-ID:
<AANLkTi=N6WxHfigUC4ZopfxswMBv8bj7BZZJErHmko_T@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
These days, if I'm going to be trying to extract data from html/xml,
I'd use querypath. Give it a try!
On Tue, Nov 30, 2010 at 11:56 AM, James Benstead
<james.benstead@gmail.com> wrote:
> What I'd like to do once a Resource has been added to the site is to scrape
> certain information from it: at this point I'm thinking the Title of the
> page the link points to and the provider of the resource - e.g., which
> Drupal shop originally created the resource. What's the best way to go about
> doing this? I'm pretty sure there's not a Drupal module that solves the
> problem out of the box.
--
John Fiala
www.jcfiala.net
------------------------------
Message: 3
Date: Tue, 30 Nov 2010 20:14:04 +0100
From: Ámon Tamás <amont@5net.hu>
Subject: [development] Easter problem
To: development@drupal.org
Message-ID:
<AANLkTikmKoVkedks2FkWUbHRq9sNTe6r0iX+iMjmBtvy@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hello,
I have the nameday module (http://drupal.org/project/nameday) and I got a
feature request for Greek namedays. As I see it, they are based on Easter,
which is not an easy thing to calculate.
So I want to find an algorithm for Easter and similar days, whose results
can be stored somehow. Maybe it should be a hook, or something else that
can be stored in the database.
Thanks
--
Ámon Tamás
Site developer and programmer
------------------------------
Message: 4
Date: Tue, 30 Nov 2010 12:22:42 -0700
From: Carl Wiedemann <carl.wiedemann@gmail.com>
Subject: Re: [development] Easter problem
To: development@drupal.org
Message-ID:
<AANLkTinD9Xz=3inJj2GraAuqde_=3yshJDwxCJzu12zr@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-2"
Does this help? http://php.net/manual/en/function.easter-days.php
On Tue, Nov 30, 2010 at 12:14 PM, Ámon Tamás <amont@5net.hu> wrote:
> Hello,
>
> I have the nameday module (http://drupal.org/project/nameday) and I get a
> feature request for the Greek namedays. How I see it is based on the Easter,
> what is not an easy thing to count.
>
> Well, I want to find some algorithm for Easter, and similar days, what is
> can be stored somehow. Maybe it should be a hook or some other think what
> can be stored in database.
>
>
> Thanks
>
> --
> Ámon Tamás
> Site developer and programmer
>
>
------------------------------
Message: 5
Date: Tue, 30 Nov 2010 13:24:07 -0600
From: "larry@garfieldtech.com" <larry@garfieldtech.com>
Subject: Re: [development] Easter problem
To: development@drupal.org
Message-ID: <4CF54F57.2030602@garfieldtech.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
There's no need for a hook here at all. You can either code in the
algorithm for defining when Easter is (which sounds like it is in fact
rather complicated) or just pre-store known, pre-calculated dates for it
for the next decade or so. (10 records, one per year; totally easy.)
Both options are described here, including the different mechanisms for
defining when Easter is in different calendars:
http://en.wikipedia.org/wiki/Easter#Date_of_Easter
--Larry Garfield
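
Both options can be sketched together. The block below is an illustration in Python, not code from the nameday module: it uses the commonly published arithmetic (attributed to Meeus) for Easter in the Julian calendar, shifts the result to the Gregorian calendar, and then precomputes a ten-year table. The function name orthodox_easter and the 2011-2020 range are assumptions chosen for the example.

```python
from datetime import date, timedelta


def orthodox_easter(year):
    """Gregorian-calendar date of Orthodox Easter for a year from 1583 on."""
    a = year % 4
    b = year % 7
    c = year % 19
    d = (19 * c + 15) % 30
    e = (2 * a + 4 * b - d + 34) % 7
    # d + e is the offset of Easter Sunday from 22 March in the Julian calendar.
    month, day = divmod(d + e + 114, 31)
    day += 1
    # Gap between the Julian and Gregorian calendars (13 days for 1900-2099).
    offset = year // 100 - year // 400 - 2
    return date(year, month, day) + timedelta(days=offset)


# Option two from the message above: precompute one record per year and store it.
EASTER_TABLE = {year: orthodox_easter(year) for year in range(2011, 2021)}
```

For example, orthodox_easter(2011) gives 24 April 2011 and orthodox_easter(2016) gives 1 May 2016. The table is the kind of data that could be written to a small database table once and read back, so nothing has to be recalculated per request.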
On 11/30/10 1:14 PM, Ámon Tamás wrote:
> Hello,
>
> I have the nameday module (http://drupal.org/project/nameday) and I get
> a feature request for the Greek namedays. How I see it is based on the
> Easter, what is not an easy thing to count.
>
> Well, I want to find some algorithm for Easter, and similar days, what
> is can be stored somehow. Maybe it should be a hook or some other think
> what can be stored in database.
>
>
> Thanks
>
> --
> Ámon Tamás
> Site developer and programmer
>
------------------------------
Message: 6
Date: Tue, 30 Nov 2010 14:23:56 -0500
From: jeff@ayendesigns.com
Subject: Re: [development] Easter problem
To: development@drupal.org
Message-ID: <4CF54F4C.2060409@ayendesigns.com>
Content-Type: text/plain; charset="utf-8"
You can google it, but I believe this is one of those things that cannot
be reduced to an equation or algorithm. It's something like the first
Sunday after the first full moon after the spring equinox.
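
For what it's worth, the churches evaluate that rule against fixed tables rather than astronomical observation, so it does come down to integer arithmetic. A minimal sketch of the widely published "anonymous Gregorian" computation for Western Easter follows, in Python for illustration only. The function name western_easter is made up here, and Greek namedays would need the Orthodox date rather than this one.

```python
from datetime import date


def western_easter(year):
    """Date of Western (Gregorian) Easter Sunday for the given year."""
    a = year % 19                  # position in the 19-year lunar cycle
    b, c = divmod(year, 100)       # century and year within the century
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)
```

As a check, western_easter(2011) gives 24 April 2011 and western_easter(2008) gives 23 March 2008.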
On 11/30/2010 02:14 PM, Ámon Tamás wrote:
> Hello,
>
> I have the nameday module ( http://drupal.org/project/nameday) and I
> get a feature request for the Greek namedays. How I see it is based on
> the Easter, what is not an easy thing to count.
>
> Well, I want to find some algorithm for Easter, and similar days, what
> is can be stored somehow. Maybe it should be a hook or some other
> think what can be stored in database.
>
>
> Thanks
>
> --
> Ámon Tamás
> Site developer and programmer
>
------------------------------
Message: 7
Date: Tue, 30 Nov 2010 13:26:23 -0600
From: "larry@garfieldtech.com" <larry@garfieldtech.com>
Subject: Re: [development] Easter problem
To: development@drupal.org
Message-ID: <4CF54FDF.7070506@garfieldtech.com>
Content-Type: text/plain; charset=ISO-8859-2; format=flowed
The Calendar PHP module is not enabled by default in a stock PHP, so I
don't know that you can rely on it (unfortunately). It does have some
cool stuff in it, though.
--Larry Garfield
On 11/30/10 1:22 PM, Carl Wiedemann wrote:
> Does this help? http://php.net/manual/en/function.easter-days.php
>
> On Tue, Nov 30, 2010 at 12:14 PM, Ámon Tamás <amont@5net.hu
> <mailto:amont@5net.hu>> wrote:
>
> Hello,
>
> I have the nameday module (http://drupal.org/project/nameday) and I
> get a feature request for the Greek namedays. How I see it is based
> on the Easter, what is not an easy thing to count.
>
> Well, I want to find some algorithm for Easter, and similar days,
> what is can be stored somehow. Maybe it should be a hook or some
> other think what can be stored in database.
>
>
> Thanks
>
> --
> Ámon Tamás
> Site developer and programmer
>
>
------------------------------
Message: 8
Date: Tue, 30 Nov 2010 11:21:08 -0800
From: Jennifer Hodgdon <yahgrp@poplarware.com>
Subject: Re: [development] Easter problem
To: development@drupal.org
Message-ID: <4CF54EA4.1050502@poplarware.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
http://php.net/manual/en/function.easter-date.php
On 11/30/2010 11:14 AM, Ámon Tamás wrote:
> I have the nameday module (http://drupal.org/project/nameday) and I get a
> feature request for the Greek namedays. How I see it is based on the Easter,
> what is not an easy thing to count.
>
> Well, I want to find some algorithm for Easter, and similar days, what is
> can be stored somehow. Maybe it should be a hook or some other think what
> can be stored in database.
--
Jennifer Hodgdon * Poplar ProductivityWare
www.poplarware.com
Drupal web sites and custom Drupal modules
------------------------------
--
[ Drupal development list | http://lists.drupal.org/ ]
End of development Digest, Vol 95, Issue 58
*******************************************