[drupal-devel] [bug] Teaser always truncated at 600 characters

neale drupal-devel at drupal.org
Mon Aug 22 18:22:15 UTC 2005


Issue status update for 
http://drupal.org/node/28211
Post a follow up: 
http://drupal.org/project/comments/add/28211

 Project:      Drupal
 Version:      cvs
 Component:    node.module
 Category:     bug reports
 Priority:     normal
 Assigned to:  Anonymous
 Reported by:  neale
 Updated by:   neale
 Status:       patch (code needs review)

It just occurred to me that when you said "break tag" you may have meant
the "br" tag and not the "!--break--" thing.  In that case, your second
point is a little closer to true, but still not quite correct.  My
patch will use as many sentences as it can before it hits the max. size
(note the change from strpos to strrpos).


Perhaps the help for the "size" should be changed to read *minimum*
size of a teaser, instead of maximum size for a teaser, if the current
behavior is what drupal is supposd to do. 


Currently, the teaser will always be larger than the "maximum size"
unless the !--break-- thing is used.  You can test this here by
creating a new book page with a very short paragraph followed by an
incredibly long one, maybe 16000 characters in multiple paragraphs. 
The incredibly long one is always part of the "trimmed version".


If you'd like to see my patch in action, head on over to , where it is
currently running.




neale



Previous comments:
------------------------------------------------------------------------

Thu, 04 Aug 2005 19:18:23 +0000 : neale

Attachment: http://drupal.org/files/issues/diff_27 (1.16 KB)

The truncation code in node_teaser doesn't make any sense to me:



<?php
 foreach ($breakpoints as $point => $charnum) {
   if ($length = strpos($body, $point, $size)) {
     return substr($body, 0, $length + $charnum);
   }
 }
?>




I don't understand: why is the offset given to strpos the maximum
teaser size?  This would cause teasers to only be nicely split if they
are enough longer than the max. teaser size to have one of the special
characters.  But this guarantees the teaser will either be abruptly
truncated, or over the max. teaser size!  In fact, this is what I'm
seeing on my installation.


Attached is a patch that makes it work the way I think it was intended.
 At least, it makes more sense to me.




------------------------------------------------------------------------

Sun, 21 Aug 2005 19:49:15 +0000 : killes at www.drop.org

changing version and status.




------------------------------------------------------------------------

Mon, 22 Aug 2005 11:36:11 +0000 : Goba

Neale, your code does not reflect the intentions in Drupal. The problem
with this patch is that it forces a truncate call on the body,
regardless of content, and then reduces the teaser size even more, to
the first paragraph, first sentence, etc. Problems with your approach
are:



* If the first paragraph is too short, it will be the teaser,
regardless of the set size.
* If there are no paragraph or break tags in the truncated teaser part,
the first sentence will be used (regardless of how short is it).
* On the other hand, if the first paragraph is too long, and the
closing tag is out of the initial truncated part, you will end up with
broken HTML.

Now all these are not the intention of the teaser generation code.
Having the exact set length or shorter is sacrificed for having valid
HTML, and complete content. While the current Drupal code might produce
too long teasers, your code can easily end up producing too short ones.
A teaser for this text with your patch would be: /Neale, your code does
not reflect the intentions in Drupal./




------------------------------------------------------------------------

Mon, 22 Aug 2005 18:06:22 +0000 : neale

(Oddly, previewing this gives no trimmed version and no full version.  I
hope the formatting here looks okay.)


Goba, I'm not sure I understand your objections.  You say:



* 
"If the first paragraph is too short, it will be the teaser, regardless
of the set size.

"
I presume you mean that an article with two paragraphs (one short, one
long) would be split at the end of the first paragraph.


Currently, if I write an article with three paragraphs (short, long,
long), the teaser will be the first two (short, long) even if the
second paragraph runs over the max. teaser size.  In fact, unless the
entire article is under the max. size, it is *guaranteed* that the
teaser will be over the max., since the code only looks for a
breakpoint *after* $size.


I'm not sure this is in line with what drupal wants, since the help for
the max. teaser size states (emphasis mine) it is "The *maximum* number
of characters used in the trimmed version of a post."


My patch addresses this by looking for a split character in the
characters before the max. length.

* 
"
If there are no paragraph or break tags in the truncated teaser part,
the first sentence will be used (regardless of how short is it).


"
This is untrue.  My patch applies after the code that looks for the
break tag, and does not alter the existing logic.  Unfortunately, the
patch doesn't show the (unmodified) part of the code you mention, so it
might be worthwhile to look at the patch alongside the original code to
see exactly what it does.

* 
"
On the other hand, if the first paragraph is too long, and the closing
tag is out of the initial truncated part, you will end up with broken
HTML.


"
This is a different problem already present in the code.  My patch
doesn't try to address this.  (Try it: create a new book page right
here, and begin it with an em tag, then paste the same sentence over
and over, and close the em tag.  The teaser will have an unclosed tag.)


I humbly suggest the reviewer look over the patch again.







More information about the drupal-devel mailing list