I'm looking for something similar to html2text, w3m -dump, or links -dump to turn HTML input into well-formatted text (using padding etc. to render tables).
Is there anything ready-made in Drupal?
html2text produces nice output, but a) I would need to fork a process and b) it doesn't seem to support UTF-8.
The overall goal is to write HTML email from a template and avoid having to write separate plain-text email templates.
On Mon, 7 Apr 2008 16:45:46 +0200 Ivan Sergio Borgonovo mail@webthatworks.it wrote:
Since there doesn't seem to be anything around that really does the job, I came up with:
function textify($html) {
    // stdin and stdout are pipes; stderr is discarded.
    $descriptorspec = array(
        0 => array("pipe", "r"),
        1 => array("pipe", "w"),
        2 => array("file", "/dev/null", "a")
    );
    $cwd = '/tmp';
    $env = array('LANG' => 'en_US.UTF-8');
    $text = '';  // returned as-is if w3m cannot be started
    $process = proc_open('w3m -dump -cols 68 -T text/html', $descriptorspec, $pipes, $cwd, $env);
    if (is_resource($process)) {
        // Feed the HTML to w3m, then close stdin so it sees EOF.
        fwrite($pipes[0], $html);
        fclose($pipes[0]);
        $text = stream_get_contents($pipes[1]);
        fclose($pipes[1]);
        proc_close($process);  // reap the child process
    }
    return $text;
}
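A possible hardening of the function above, as a sketch only: check w3m's exit status and fall back to crude tag stripping when w3m is missing or fails. The name textify_or_fallback(), the default PATH value and the strip_tags() fallback are assumptions made for illustration, not something taken from Drupal, and the fallback does not render tables at all.

```php
<?php
// Sketch only: textify_or_fallback() is a hypothetical name, not a Drupal API.
function textify_or_fallback($html, $cols = 68) {
    $descriptorspec = array(
        0 => array('pipe', 'r'),              // child's stdin
        1 => array('pipe', 'w'),              // child's stdout
        2 => array('file', '/dev/null', 'a')  // discard stderr
    );
    // Assumption: pass PATH along so the shell can find w3m; the default is a guess.
    $path = getenv('PATH');
    $env = array(
        'LANG' => 'en_US.UTF-8',
        'PATH' => $path ? $path : '/usr/local/bin:/usr/bin:/bin',
    );
    $cmd = 'w3m -dump -cols ' . (int) $cols . ' -T text/html';

    $text = '';
    $status = -1;
    $process = proc_open($cmd, $descriptorspec, $pipes, '/tmp', $env);
    if (is_resource($process)) {
        @fwrite($pipes[0], $html);  // @: the child may already have exited
        fclose($pipes[0]);
        $text = stream_get_contents($pipes[1]);
        fclose($pipes[1]);
        $status = proc_close($process);  // exit code of the command
    }
    if ($status !== 0 || trim((string) $text) === '') {
        // Crude fallback: drop tags and decode entities. No table layout here.
        $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    }
    return $text;
}
```

The fallback keeps mail going out when w3m is unavailable, at the cost of losing the padded table rendering that motivated w3m in the first place.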
I hate to fork, but writing an HTML parser was not something I was planning to do overnight. There seems to be an HTML::[forgetwhat] module in Perl that does a job similar to w3m -dump. I didn't find anything in PEAR... If anyone knows of a good library, I'd be glad to kick that proc_open out of my code.
Now it's time to do a security assessment and see whether the incoming $html needs to be filtered.
links and lynx didn't seem to cope well with UTF-8.
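Because the command line is a fixed string and $html only reaches w3m through stdin, the shell itself never sees the markup. Some cheap checks before forking might still be worthwhile. The sketch below is only an assumption about what such checks could look like: the function name html_is_acceptable() and the 1 MB default limit are made up for illustration. It rejects oversized input and input that is not valid UTF-8.

```php
<?php
// Sketch only: html_is_acceptable() and the default limit are hypothetical.
function html_is_acceptable($html, $max_bytes = 1048576) {
    if (!is_string($html) || strlen($html) > $max_bytes) {
        return false;  // refuse non-strings and very large documents
    }
    // With the /u modifier, preg_match() fails on subjects that are not valid UTF-8.
    return preg_match('//u', $html) === 1;
}
```

textify() would then be called only when html_is_acceptable($html) returns true. This does not cover everything a real assessment would look at; it only bounds the work handed to the external process.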