[development] t() placeholder changes in HEAD

Steven Wittens steven at acko.net
Sun Aug 20 22:24:42 UTC 2006


>
> how are we supposed to keep this stuff straight?  from brute-force
> repetition of reading api.drupal.org, i can kind of remember now
> how to do it right, but it's a waste of time and energy, and the
> whole system is *highly* error prone.  there are probably dozens of
> places that end up sanitizing twice, due to confusion about which
> function does the cleaning, and people erring on the side of
> "better safe than sorry" (for example, see
> http://drupal.org/node/79611#comment-126559).

In essence, converting plain text to HTML happens where data is
inserted into a set of HTML tags. The problem is that there is no
single location in Drupal that does that. In theory the theme layer is
the most appropriate place, but that is also the layer where the least
programming-oriented people (designers) work.

The move for adding check_plain() to t() is part of my desire to have  
a clean set of APIs for creating HTML, so that in the end, no module  
will ever do a literal print command with unprocessed variables in it.
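
To make the idea concrete, here is a minimal sketch of a t()-style
helper that escapes every placeholder value before substituting it.
It is written in Python rather than PHP, and t_sketch() is a
hypothetical name used only for illustration; it is not Drupal's t()
and does no translation.

```python
import re
from html import escape


def t_sketch(template, args=None):
    """Hypothetical t()-like helper: substitute placeholders, escaping each value.

    Assumption: placeholders are literal keys such as "@name" in `args`.
    """
    args = args or {}
    if not args:
        return template
    # Match the longest keys first so "@name" does not shadow "@name_full".
    keys = sorted(args, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Single pass over the template: inserted values are never re-scanned.
    return pattern.sub(lambda m: escape(str(args[m.group(0)]), quote=True), template)


message = t_sketch("Welcome back, @name.", {"@name": "<script>alert(1)</script>"})
print(message)  # Welcome back, &lt;script&gt;alert(1)&lt;/script&gt;.
```

The caller never has to remember to escape: the only way to get a
variable into the string is through the helper.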

For themable functions, I'd like to make it so that only
safe-for-output variables are passed to them. This fits in with the
"page api/structured output" idea. I'd even go so far as to not pass
internal objects such as $node to themable functions (because it leads
to unsafe template code). That's something for the future though, and
would be a rather serious change. In practice, because we have
phptemplate.engine sitting between a module and the .tpl.php file,
some of the responsibility is taken away from the themer already.
But we need to move this responsibility explicitly to a place right
before the theme layer, without clogging up the modules too much.

Right now, in practice, the rules are not that hard. If you use a
Drupal function to do output for you, output is sanitized in almost
every case. In the cases where it isn't, that's because there are
many valid use cases where a caller will want to pass in HTML tags
(e.g. inline markup for form element titles).
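
The "sanitizing twice" problem follows directly from this: if an
output function already escapes, escaping in the caller as well
corrupts the text. A small sketch, again in Python, where
render_title() is a hypothetical stand-in for any output function that
sanitizes its argument (it is not a Drupal function):

```python
from html import escape


def render_title(text):
    """Hypothetical output function that escapes its plain-text argument itself."""
    return "<h2>" + escape(text, quote=True) + "</h2>"


title = "Q&A"

# Correct: pass plain text and let the output function escape it once.
once = render_title(title)

# "Better safe than sorry": the caller escapes too, so the text is escaped twice.
twice = render_title(escape(title, quote=True))

print(once)   # <h2>Q&amp;A</h2>
print(twice)  # <h2>Q&amp;amp;A</h2>
```

The second result displays as "Q&amp;A" in a browser instead of "Q&A".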

Also note that this rule applies to many other ways of outputting
text beyond check_plain(). For example, drupal_mail() calls
mime_header_encode() on all header values, and
drupal_query_string_encode() calls urlencode() on everything. I don't
think anyone will dispute that if those automatic checks were not
there, 90%+ of all code would forget them.
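
As an illustration of why the encoder depends on the output context,
the sketch below encodes values for HTML, for a URL query string and
for a mail header. It uses Python's standard library as a stand-in for
the Drupal functions named above; the for_*() helper names are
hypothetical.

```python
from email.header import Header
from html import escape
from urllib.parse import urlencode


def for_html(text):
    """Escape plain text for insertion between HTML tags or into an attribute."""
    return escape(text, quote=True)


def for_query(params):
    """Encode a dict of parameters as a URL query string."""
    return urlencode(params)


def for_mail_header(text):
    """Encode text as an RFC 2047 encoded-word for a mail header value."""
    return Header(text, charset="utf-8").encode()


value = 'Tom & "Jerry" <tj@example.com>'

print(for_html(value))          # Tom &amp; &quot;Jerry&quot; &lt;tj@example.com&gt;
print(for_query({"q": value}))  # q=Tom+%26+%22Jerry%22+%3Ctj%40example.com%3E
print(for_mail_header("Café & más"))
```

Same kind of input, three different and incompatible encodings: using
the HTML one in a URL, or the URL one in a mail header, is simply
wrong.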

In essence, you should never ever do any sort of output without some
sort of context-specific data escaping/encoding function. The problem
is simply that a ridiculous number of people are not aware of this at
all, because on the web, all formats are text and their syntaxes are
so similar that people think they are the same.

Steven Wittens


