[support] Node Import from HTML Files

Earnie Boyd earnie at users.sourceforge.net
Thu May 3 20:19:15 UTC 2012


On Thu, May 3, 2012 at 11:17 AM, Fred Jones <fredthejonester at gmail.com> wrote:
> I have a D7 site--it's now live and doing well and the next step is to
> import a large set of staff bios. They are right now stored as PHP
> files, in a very organized format. They have these fields:
>
>    Name
>    Title
>    Bio
>    Image
>
> which must be parsed out of the HTML, and for Image, I must actually
> of course import the image as an image file. It looks like I would NOT
> have to parse the PHP b/c the PHP part is unrelated to the content
> items I need. I just need to parse out, for example, in a div with
> class="title" there is an H2 with his name and a P with the person's
> title.
>
> I think I can use an XML parser in PHP to get those items.
>
> My first question is, should I be looking into the Feeds module, the
> Migrate module or do this with custom code? Once I get started
> hopefully I can finish. :)

So is this more for individual user profiles, or for one page giving the
staff data to anonymous users?  It sounds like more than one page, so
you could create a staff content type and add fields for the data (name,
title, bio, image).  If all the pages are well-formed XHTML, an XML
parser might be good enough; otherwise you will need the DOM parser,
which can load HTML that is not well formed.  I think I would custom
code it, but I don't know exactly what you have.
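
To illustrate the XML-versus-DOM point, here is a minimal sketch of the
"try the strict parser first, fall back to a lenient one" idea.  It is
written in Python only so it can be run standalone; in PHP the same
split would be between an XML parser and DOMDocument::loadHTML().  The
function names, the sample markup and the person "Jane Doe" are all
made up for the example, and the div class="title" / H2 / P layout is
assumed from the description in the original question.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def parse_strict(markup):
    """Parse with an XML parser; raises ET.ParseError if not well formed.

    Assumes the markup carries no XML namespace (hypothetical layout).
    """
    root = ET.fromstring(markup)
    for div in root.iter("div"):
        if div.get("class") == "title":
            return {
                "name": (div.findtext("h2") or "").strip(),
                "title": (div.findtext("p") or "").strip(),
            }
    return None


class TitleBlockParser(HTMLParser):
    """Lenient fallback: tolerates unclosed tags inside the title block.

    Limitation: a nested div inside the block ends the capture early.
    """

    FIELD_FOR_TAG = {"h2": "name", "p": "title"}

    def __init__(self):
        super().__init__()
        self.in_block = False
        self.current = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag == "div" and dict(attrs).get("class") == "title":
            self.in_block = True
        elif self.in_block and tag in self.FIELD_FOR_TAG:
            field = self.FIELD_FOR_TAG[tag]
            if field not in self.fields:  # keep only the first H2 / P
                self.fields[field] = ""
                self.current = field

    def handle_endtag(self, tag):
        if tag == "div":
            self.in_block = False
        if tag in ("h2", "p", "div"):
            self.current = None

    def handle_data(self, data):
        if self.current:
            self.fields[self.current] += data


def extract_staff_fields(markup):
    """Return the name and title, preferring the strict XML parser."""
    try:
        result = parse_strict(markup)
        if result is not None:
            return result
    except ET.ParseError:
        pass  # not well formed, so fall through to the lenient parser
    parser = TitleBlockParser()
    parser.feed(markup)
    parser.close()
    return {key: value.strip() for key, value in parser.fields.items()}
```

Calling extract_staff_fields() on a well-formed block uses the XML
path; a block with an unclosed P or a bare BR makes the XML parser
raise, and the lenient parser picks up the same two fields.  Importing
the image file and creating the nodes would still be separate steps.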

-- 
Earnie
-- https://sites.google.com/site/earnieboyd

