[drupal-devel] [bug] Aggregator module: incorrect process html in description when title is blank
Issue status update for http://drupal.org/node/19573

 Project:      Drupal
 Version:      4.5.2
 Component:    aggregator.module
 Category:     bug reports
 Priority:     normal
 Assigned to:  Anonymous
 Reported by:  edhel
 Updated by:   edhel
 Status:       active

This is a problem in aggregator.module in Drupal 4.5.x, and in 4.6 as well. I noticed it while aggregating posts from LiveJournal.com that have a blank title and HTML in the description at the same time.

In aggregator.module, in aggregator_parse_feed() at line 501 (Drupal 4.5.2), there is this code:

<?php
if ($item['TITLE']) {
  $title = $item['TITLE'];
}
else {
  $title = preg_replace('/^(.*)[^\w;&].*?$/', "\\1", truncate_utf8($item['DESCRIPTION'], 40));
}
?>

I think this preg_replace call is incorrect because:

1) it doesn't strip HTML tags
2) it may work incorrectly with national characters (i.e. \w): maybe ereg_replace is a better solution?

To process such items correctly I changed the code to this:

<?php
if ($item['TITLE']) {
  $title = $item['TITLE'];
}
else {
  // $title = preg_replace('/^(.*)[^\w;&].*?$/', "\\1", truncate_utf8($item['DESCRIPTION'], 40));
  $title = ereg_replace('<.*?(>|$)', '', truncate_utf8($item['DESCRIPTION'], 40));
  $title = ereg_replace('/^(.*?)[^[:alnum:];&].*?$/', "\\1", $title) . "...";
}
?>

edhel
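To make point 1 of the report concrete, here is a small standalone sketch of the original fallback run outside Drupal. It is an illustration only: truncate_utf8() is a Drupal function, so the demo_truncate() below is a hypothetical stand-in that simply keeps the first N characters, and the sample description is invented.

```php
<?php
// Hypothetical stand-in for Drupal's truncate_utf8(): keep at most $len
// UTF-8 characters. This is NOT the real Drupal implementation.
function demo_truncate($string, $len) {
  return preg_match('/^.{0,' . (int) $len . '}/us', $string, $m) ? $m[0] : '';
}

// The fallback from aggregator_parse_feed() as quoted in the report,
// unchanged apart from the truncation stand-in.
function demo_original_title($description) {
  return preg_replace('/^(.*)[^\w;&].*?$/', "\\1", demo_truncate($description, 40));
}

// Invented sample: an item with a blank title and HTML in the description.
$title = demo_original_title('<a href="http://example.com/">A post</a> with no title');

// The tag is still part of the generated title.
echo $title, "\n";
```

With this input the 40-character cut falls inside the markup, so the generated title still begins with the opening tag rather than with readable text.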
Issue status update for http://drupal.org/node/19573

 Project:      Drupal
-Version:      4.5.2
+Version:      cvs
 Component:    aggregator.module
 Category:     bug reports
 Priority:     normal
 Assigned to:  Anonymous
 Reported by:  edhel
 Updated by:   Morbus Iff
 Status:       active

Wouldn't you want to remove the HTML from the description FIRST, and then take the first 40 characters of the remainder? The proposed code could still return no title, especially if the first 40 characters of the description are something like "[a href="http://www.disobey.com/"][strong]an example of an empty title based on a 40 character trunc before HTML removal[/strong][/a]".

As for the i18n stuff, I know nothing about it, so someone else will have to address the change to ereg instead of preg.

Morbus Iff

Previous comments:
------------------------------------------------------------------------

March 27, 2005 - 23:59 : edhel

This is a problem in aggregator.module in Drupal 4.5.x, and in 4.6 as well. I noticed it while aggregating posts from LiveJournal.com that have a blank title and HTML in the description at the same time.

In aggregator.module, in aggregator_parse_feed() at line 501 (Drupal 4.5.2), there is this code:

<?php
if ($item['TITLE']) {
  $title = $item['TITLE'];
}
else {
  $title = preg_replace('/^(.*)[^\w;&].*?$/', "\\1", truncate_utf8($item['DESCRIPTION'], 40));
}
?>

I think this preg_replace call is incorrect because:

1) it doesn't strip HTML tags
2) it may work incorrectly with national characters (i.e. \w): maybe ereg_replace is a better solution?

To process such items correctly I changed the code to this:

<?php
if ($item['TITLE']) {
  $title = $item['TITLE'];
}
else {
  // $title = preg_replace('/^(.*)[^\w;&].*?$/', "\\1", truncate_utf8($item['DESCRIPTION'], 40));
  $title = ereg_replace('<.*?(>|$)', '', truncate_utf8($item['DESCRIPTION'], 40));
  $title = ereg_replace('/^(.*?)[^[:alnum:];&].*?$/', "\\1", $title) . "...";
}
?>
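The ordering Morbus Iff asks about (strip the HTML first, then truncate) might be sketched as follows. This is only an illustration under stated assumptions, not a patch for aggregator.module: demo_fallback_title() is a hypothetical name, it uses PHP's strip_tags() and the PCRE /u modifier in place of truncate_utf8() and ereg_replace(), it assumes the description is valid UTF-8, and it does not decode HTML entities.

```php
<?php
// Hypothetical helper, not part of aggregator.module: build a fallback
// title from a description by stripping markup BEFORE truncating.
function demo_fallback_title($description, $len = 40) {
  // 1. Remove tags first, so markup never counts toward the limit.
  $text = strip_tags($description);

  // 2. Collapse whitespace runs; the /u flag treats the text as UTF-8.
  $text = preg_replace('/\s+/u', ' ', $text);
  if ($text === null) {
    return '';  // Input was not valid UTF-8 (assumption violated).
  }
  $text = trim($text);

  // 3. Short enough already: use it unchanged.
  if (preg_match('/^.{0,' . (int) $len . '}$/us', $text)) {
    return $text;
  }

  // 4. Keep the first $len characters (characters, not bytes).
  preg_match('/^.{0,' . (int) $len . '}/us', $text, $m);

  // 5. Drop the last, possibly partial, word; keep the raw cut if the
  //    text contains no whitespace at all.
  $cut = preg_replace('/\s+\S*$/u', '', $m[0]);
  return ($cut === '' ? $m[0] : $cut) . '...';
}

// Invented samples: markup around a short text, and a long non-ASCII text.
echo demo_fallback_title('<a href="http://www.disobey.com/"><strong>an example of an empty title</strong></a>'), "\n";
echo demo_fallback_title('<p>' . str_repeat('слово ', 20) . '</p>'), "\n";
```

Because the tags are gone before the length is measured, a description that opens with a long link no longer yields an empty or markup-only title, and the /u patterns cut on whole characters rather than bytes.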