t() placeholder changes in HEAD
A rather big patch was committed which affects t() in HEAD and thus pretty much every module out there. It's related to XSS and proper output conversion of text, so do pay attention ;). The good news is, if you didn't care about check_plain() before, now some possible XSS holes should be automagically fixed. For more info, see: http://drupal.org/node/64279#t-placeholders Steven Wittens
A rather big patch was committed which affects t() in HEAD and thus pretty much every module out there. It's related to XSS and proper output conversion of text, so do pay attention ;).
What was the nid for this? I'd like to know the reasoning/discussion behind using the existing % for the least obvious (theme_placeholder) and the new and entirely unknown @ for the most obvious (check_plain). -- Morbus Iff ( think about the good things that I did for you ) Technical: http://www.oreillynet.com/pub/au/779 Culture: http://www.disobey.com/ and http://www.gamegrene.com/ icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
What was the nid for this? I'd like to know the reasoning/discussion behind using the existing % for the least obvious (theme_placeholder) and the new and entirely unknown @ for the most obvious (check_plain).
Ne'ermind. /me reads http://drupal.org/node/76802. -- Morbus Iff ( accept no prostitutes ) Technical: http://www.oreillynet.com/pub/au/779 Culture: http://www.disobey.com/ and http://www.gamegrene.com/ icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
What was the nid for this? I'd like to know the reasoning/ discussion behind using the existing % for the least obvious (theme_placeholder) and the new and entirely unknown @ for the most obvious (check_plain).
Theme('placeholder') is used more in core than check_plain(), at least for t(). Steven Wittens
On Aug 18, 2006, at 5:39 AM, Morbus Iff wrote:
I'd like to know the reasoning/discussion behind using the existing % for the least obvious (theme_placeholder) and the new and entirely unknown @ for the most obvious (check_plain).
in case anyone's confused about this, theme_placeholder() calls check_plain() for you, so from a security perspective, the 2 are equivalent. the only real difference is '%' (theme_placeholder) will wrap your string in <em> whereas '@' (check_plain) won't. i'm fine with this change to t(), but once again, i must object to the rather insane combinations of layers of drupal code that might or might not call check_plain() or otherwise sanitize your output for you. :( no wonder it's hard for developers to write secure code, correct code. for the life of me, i can't figure out the reasoning behind (and therefore have any intuition about) what functions sanitize my output and which ones don't. theme_* sometimes does format_* often does check_* always does l() and url() do ... how are we supposed to keep this stuff straight? from brute-force repetition of reading api.drupal.org, i can kind of remember now how do do it right, but it's a waste of time and energy, and the whole system is *highly* error prone. there are probably dozens of places that end up sanitizing twice, due to confusion about what what function does the cleaning, and people err'ing on the side of "better safe than sorry" (for example, see http://drupal.org/node/ 79611#comment-126559). it's way too late in the dev cycle for this, but i'd cast a large vote for a much more coherent way of handling output conversion and sanitizing in the next core API. thanks, -derek
how are we supposed to keep this stuff straight? from brute-force repetition of reading api.drupal.org, i can kind of remember now how do do it right, but it's a waste of time and energy, and the whole system is *highly* error prone. there are probably dozens of places that end up sanitizing twice, due to confusion about what what function does the cleaning, and people err'ing on the side of "better safe than sorry" (for example, see http://drupal.org/node/ 79611#comment-126559).
In essence, converting plain text to HTML happens where data is inserted into a set of HTML tags. The problem is that there is no single location that does that in Drupal. In theory it is most appropriate inside the theme layer, but that is also the layer where the most non-programmer-oriented types work (designers). The move for adding check_plain() to t() is part of my desire to have a clean set of APIs for creating HTML, so that in the end, no module will ever do a literal print command with unprocessed variables in it. For themable functions, I'd like to make it so that only safe-for- output variables are passed to them. This fits in with the "page api/ structured output" idea. I'd even go so far as to not pass internal objects such as $node to themable functions (because it leads to unsafe template code). That's something for the future though and would be a rather serious change. In practice, because we have phptemplate.engine sitting between a module and the .tpl.php file, the responsibility is taken away from the themer a bit already. But we need to explicitly separate out this responsibility to a place right before the theme layer, but without clogging up the modules too much. Right now, in practice, the rules are not that hard. If you use a Drupal function to do output for you, output is sanitized in almost every case. In that cases where it isn't, that's because there are many valid use cases where a callee will want to pass in HTML tags (e.g. inline markup for form element titles). Also note that this rule applies to many other ways of outputting text beyond check_plain(). For example, the fact that drupal_mail() calls mime_header_encode() on all header values, or that drupal_query_string_encode() calls urlencode() on everything. I don't think anyone will dispute that if those automatic checks were not there, 90%+ of all code would forget them. In essence, you should never ever do any sort of output without some sort of context-specific data escaping/encoding function. The problem is simply that an incredibly ridiculous amount of people are not aware of this at all, because on the web, all formats are text and their syntaxes are so similar that people think they are the same. Steven Wittens
Derek Wright wrote: [big snip]
it's way too late in the dev cycle for this, but i'd cast a large vote for a much more coherent way of handling output conversion and sanitizing in the next core API.
Hear, hear! +1 Coherent APIs lead to better, more reliable code. :-) ..chrisxj
Op zondag 20 augustus 2006 20:25, schreef Derek Wright:
it's way too late in the dev cycle for this, but i'd cast a large vote for a much more coherent way of handling output conversion and sanitizing in the next core API.
In a previous mail, I pointed out this insanity too. I suggested two things: * either we ONLY sanitize in the theme layer, on the very very last moment. the moment a sting is HTMLified. * OR we sanitize it all before passing along to the theme layer, themes get somehow HTMLiefied and clean strings. And maybe there are other sane places to do the sanitizing. My preference goes out to the first: It is the most clear, and the most consistent. It also insures us that themes get really raw data. and not some already-prepared HTML-ified data. A theme is about HTML-ifying it. A theme is the only place that really knows how sane it needs the data. Some people then said that they would never trust their security in the hands of themers, and rather keep it in the modules. A valid point too. Dries agreed at that time that a single place to do all our security is badly needed. But after that thread it all dropped silent, because we were promised that fapi2 would take care of this. I beleive we can still *agree* on a single location, and then work towards that with small patches, one place a time. We don't need uge projects like fapi2. We can do it one-patch-a-time too. :) Bèr 'SpagettiSecurity' Kessels
participants (5)
-
Bèr Kessels -
Chris Johnson -
Derek Wright -
Morbus Iff -
Steven Wittens