[drupal-devel] [bug] Fix a serious bug and clean up teaser generation

Steven drupal-devel at drupal.org
Fri Apr 15 13:47:13 UTC 2005


Issue status update for http://drupal.org/node/18836

 Project:      Drupal
 Version:      cvs
 Component:    node system
 Category:     bug reports
 Priority:     normal
 Assigned to:  Goba
 Reported by:  Goba
 Updated by:   Steven
 Status:       patch

I'm definitely in Goba's camp. Yes we would be dealing with the PHP
filter as a special case, but then it is a really special filter. In
fact, it is the only filter like it: it does not transform user-input
from one format to the other, it executes it. It is itself a hack, but
one which works very well within the context of the filter system.


By generating teasers and caching them, we would need to regenerate all
teasers if filter settings change. This isn't a problem in Dries'
situation, but it is not clear to me how his approach would deal with
storing teasers in the database. We also need to consider how this
would mesh with custom teasers, like flexinode.module or
excerpt.module.


Using PHP code in Drupal core is only going to increase when we're
going to collect code snippets in the handbook. I don't think treating
the PHP filter as a special case is unacceptable for core: until we see
another code evaluator filter, this whole discussion is unnecessary.


There is another thing to consider: the only reason we can
short-cirtcuit teaser generation for PHP code is because you can decide
yourself in the PHP code whether to output a teaser or the full body.
This is not something you can do when the "no cache" filter in question
is a 'normal' filter without conditional logic in it.


The current filter system works the way it does because people
currently want it to do all those things. Custom evaluation filters is
not one of those things.


The fact is, this patch improves PHP teaser behaviour massively
already. What we have now in core is an even bigger hack. +1 on this
patch, for 4.6 final.




Steven



Previous comments:
------------------------------------------------------------------------

March 13, 2005 - 16:37 : Goba

Attachment: http://drupal.org/files/issues/Drupal.teaser-bugfix-and-cleanup.patch (4.48 KB)

The node teaser generator keeps the whole body as the teaser, if there
is indication that PHP code is included in the body (since it is
unkown, what is going to come out of the PHP parser). The only serious
problem is that it does not check if the PHP filter is included in the
input format, so it does keep the complete body even if there is no PHP
parsing on the code. This breaks PHP code examples in articles or blog
posts, which are not supposed to be parsed, but which require proper
teasers.


The solution is to make the teaser generation code aware of the input
format (pass the whole node), and let it provide the teaser that way,
depending on the input format.


The attached patch also cleans up that very unprofessionaly looking
sequence of copy-pasted stuff in the node_teaser function.




------------------------------------------------------------------------

March 13, 2005 - 16:50 : Goba

Attachment: http://drupal.org/files/issues/Drupal.teaser-bugfix-and-cleanup-alternate.patch (4.07 KB)

Here is an alternate approach, which does not take the node as
reference, but asks for the body and the format, and then returns the
teaser, as before. Pick this or the previous one, or suggest something
else :)




------------------------------------------------------------------------

March 17, 2005 - 14:59 : Goba

Is someone willing to review these fine patches? Either one :)




------------------------------------------------------------------------

March 17, 2005 - 15:57 : JonBob

-1 on using the node reference; some node types need to extract the
teaser from a field or collection of fields not named "body."


The other patch looks fine to me, but I haven't tried running it.




------------------------------------------------------------------------

March 17, 2005 - 17:45 : Goba

JonBob, this is just an internal decision, as node_teaser() is not a
hook implemented, it is just the default logic used to generate the
teaser. Other modules are free to provide a teaser for a node in the
node validation process before this logic is invoked, so that the
custom teaser is not overwritten. Passing on the node reference or the
body and the format is just a matter of programming style...




------------------------------------------------------------------------

March 17, 2005 - 17:49 : JonBob

Not entirely. It seems sensible to me for a module to concatenate two
fields and run them, as a unit, through node_teaser() to generate a
teaser that way. Using the node reference, they would have to abuse the
body field for this (and if one of the two fields in question is "body"
already, that's a sticky situation).




------------------------------------------------------------------------

March 17, 2005 - 17:51 : Goba

Ah, understood, and agreed!




------------------------------------------------------------------------

March 18, 2005 - 10:34 : Steven

Can't this be solved by allowing the passing of NULL as the $format, and
to disable this check in that case? I really like this solution, it's
much better than the current hack.




------------------------------------------------------------------------

March 18, 2005 - 10:35 : Steven

I was talking about the alternate solution by the way.




------------------------------------------------------------------------

March 18, 2005 - 20:46 : Goba

Attachment: http://drupal.org/files/issues/Drupal.teaser-bugfix-and-cleanup-alternate2.patch (4.1 KB)

Steven's suggestion implemented.




------------------------------------------------------------------------

March 20, 2005 - 16:14 : Goba

Seems like everyone is happy with this patch :) Or is that only my
single minded view? ;)




------------------------------------------------------------------------

March 20, 2005 - 21:08 : Dries

I wonder if this is the best way to check for the PHP code filter.  It's
not exactly clean code.  Except for that check, this patch looks good.




------------------------------------------------------------------------

March 20, 2005 - 21:15 : Goba

Dries, the filter format API provides filter_list_format() which returns
an array in this format. I don't think that redefining the filter API is
a good move in this part of the release cycle.




------------------------------------------------------------------------

March 21, 2005 - 23:16 : Dries

But what do we do when I create a filter X which needs special treatment
just like the PHP filter?




------------------------------------------------------------------------

March 21, 2005 - 23:21 : Goba

Node module will not know anything about arbitrary filters, but it is
not supposed to know anything about them either... We can put some new
shiny hook in here to let filters help doing teasers, but I think it is
just overcomplicating the question here. Sure this patch will not solve
issues with arbitrary filters, this just solves an existing bug, and
cleans up code. I don't have a good idea though to handle arbitrary
filters while generating the teaser, and would really like to get the
fix into Drupal for 4.6.0. Any good ideas are welcome!




------------------------------------------------------------------------

March 23, 2005 - 23:39 : Goba

I would be happy to work more on this patch, given a direction...




------------------------------------------------------------------------

March 29, 2005 - 12:07 : TDobes

+1 for this patch.


As for arbitrary filters that may require special handling, that
problem existed before this patch existed.  My thought on solving it
would be to move teaser generation to occur on the filtered text.  This
would solve the problem of needing to deal with special-cases filters
like the PHP code filter and any others that will encounter problems if
their input is truncated.  Of course, moving teaser generation in this
way will cause it to potentially generate unclosed tags which won't be
fixable by an htmlcorrector filter because the filters would have
already done their processing.  Of course, if htmlcorrector were part
of core, we could choose to run it after the teaser generation rather
than with the other filters.


Anyway, those are just thoughts... not necessarily "good ideas." :-)  I
suggest we commit this patch as-is to deal with the PHP code filter as a
special case, and deal with allowing other filters to be special cases
later.  As far as I know, no other filter needs special handling at the
moment other than the PHP filter.




------------------------------------------------------------------------

March 29, 2005 - 17:29 : Goba

TDobes, just a small comment: we don't do teaser generation on the
output to allow dynamic teasers. The PHP filter for example allows for
dynamic information to be put into the text.




------------------------------------------------------------------------

March 29, 2005 - 18:34 : Dries

Generating the teaser from the raw input isn't going to work; this is
illustrated by the PHP filter, but I could come up with a handful of
other examples. The proposed patch is a quick hack; it fails to address
the larger problem. 


Generating the teaser from the output (filtered input) may result in
unexpected behavior, but at least it isn't a hack.  Generating the
teaser from the output would work if the filter system would not cache
all generated output.  I'd have to dig the code but if I remember
correctly, this was within the filter system's capabilities.  For
example, how does it deal with 'non-teaser dynamic output'?


I'd certainly prefer to explore the 'on output' path.




------------------------------------------------------------------------

March 29, 2005 - 18:38 : Goba

Dries, you don't care about unexpected output, as long as it is
generated by some fine code? Really? I needed to read your words
multiple times, but still cannot understand your point here...




------------------------------------------------------------------------

March 29, 2005 - 22:38 : Dries

Goba, not quite.  I suggest to use 'on output teasers' in combination
with _not_ caching dynamic teasers.  Like that, there won't be
unexpected behavior because PHP-based teasers wouldn't get cached. 
This should be fairly easy to accomplish because the filter system
supports this already.  Open filter.module and search for 'no cache'. 
Makes sense?




------------------------------------------------------------------------

March 29, 2005 - 23:03 : Goba

Aham, so 'on output teaser' generation would mean that teasers would be
dynamically generated all the time, and not stored in the DB? The
suggested name of the approach implies that. Or I just got totally lost
in the discussion... I increasingly feel like wasting your time asking
to explain to me what you are up to... And most of the time I arrive at
dead ends. Like changing the teaser generation so dramatically in this
phase of the 4.6 release process does not seem to be a feasible
project, although it might make sense for 4.7.




------------------------------------------------------------------------

March 29, 2005 - 23:22 : Dries

Goba: most teasers would get cached except those that are specified to
be dynamic (i.e. those who are generated by a filter with the 'no
cache' attribute set).  The current filter system only caches filtered
output if it can be cached.  For example, it does not cache filtered
output that is the result of the PHP filter.  This shouldn't (and
wouldn't) be any different for teasers.  See filter_format_allowcache()
and check_output() for details.


Either way, I'll give it some more thought.  Like, I want to
investigate how this affects teasers being stored in the database.




------------------------------------------------------------------------

April 8, 2005 - 20:16 : Goba

Well, this would surely be nice to fix in 4.6.







More information about the drupal-devel mailing list