[support] hex codes in node_import files
Marty Landman
mlandman at face2interface.com
Fri Jan 8 23:27:21 UTC 2010
I got to answer my own question. It was pretty simple after all: I
found all the characters in the field data with a byte value of 0x80
(decimal 128) and up, then converted them to HTML numeric character
codes by prefixing the decimal value with '&#' and suffixing ';'. So
146 -> &#146;
Seems to have worked well enough for my purpose although I don't know
for sure that this accounts for all possibilities.
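
A minimal sketch of the conversion described above, written in Python
rather than PHP purely for illustration. The function name
escape_high_bytes is hypothetical, and the sketch assumes the field
data is available as raw single-byte data:

```python
def escape_high_bytes(data: bytes) -> str:
    """Replace every byte >= 0x80 with '&#N;', N being its decimal value."""
    parts = []
    for b in data:
        if b >= 0x80:
            # High byte: emit an HTML numeric character code, e.g. 146 -> &#146;
            parts.append("&#%d;" % b)
        else:
            # Plain 7-bit ASCII passes through unchanged
            parts.append(chr(b))
    return "".join(parts)


if __name__ == "__main__":
    print(escape_high_bytes(b"don\x92t"))
```

Codes in the 128-159 range are not valid character references in
strict HTML, but browsers generally display them as the matching
Windows-1252 characters, which may be why this looked right.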
+++++++++++++++++++++++++++++++++++++++++++++
I'm porting CCK content from a redesigned D4 site to D6 by extracting
the data in a PHP script, generating CSV files, then using node_import
to create the new nodes. It's working out pretty well so far, but I
hit the following snag on one of the content types. There are hex
encoded characters in the D4 content, and somewhere in the import
process the text gets truncated at the first appearance of one of
these, e.g. 0x99, 0x93, 0x94.
I'm thinking the easiest way to handle it would be to preprocess the
data in my PHP script, converting it to what it really should be
anyway, i.e. HTML special characters such as the following:
0x99 -> ™
0x93 -> “
0x94 -> ”
Anyone know of an existing utility to do this for me? Otherwise I can
code up a conversion method to accomplish it.
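
One way such a conversion method might look, sketched in Python rather
than PHP. The three mappings listed above match the Windows-1252 code
page, so this assumes the D4 data is Windows-1252 encoded, which the
data itself would have to confirm. The function name
cp1252_to_entities is hypothetical:

```python
def cp1252_to_entities(data: bytes) -> str:
    """Decode bytes as Windows-1252, then escape non-ASCII characters
    as decimal HTML numeric character references."""
    # Assumption: the source bytes are Windows-1252. Bytes that Python's
    # cp1252 codec leaves undefined become U+FFFD via errors="replace".
    text = data.decode("cp1252", errors="replace")
    return "".join(
        ch if ord(ch) < 0x80 else "&#%d;" % ord(ch)
        for ch in text
    )


if __name__ == "__main__":
    print(cp1252_to_entities(b"\x93Acme\x99\x94"))
```

This emits references based on Unicode code points (0x99 becomes
&#8482;) rather than on the raw byte value.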
Marty