[development] Wrong usage of urlencode and drupal_urlencode
Steven Wittens
steven at acko.net
Fri Apr 13 19:38:50 UTC 2007
>> "This function is convenient when encoding a string to be used in
>> a query part of a URL, as a convenient way to pass variables to
>> the next page."
>> Currently the urlencode function is not only used for passing the
>> url but also for displaying it, using l() for instance. Many
>> modules experience errors because of this issue: upload,
>> filefield, image, imagecache, .. Also just linking to a filename
>> using l() fails. The value field always contains the correct
>> filename (for example "this is a+test.pdf") while the actual link
>> refers to the urlencoded filename (for example
>> "this+is+a%2Btest.pdf" or "this+is+a+test.pdf") on which it breaks, since
>> that file does not exist.
>> I feel that while url() shouldn't be altered, l() should urldecode
>> (or have a drupal_urldecode) the output created by url().
>> Currently all those modules are fixing this issue themselves, or
>> are not addressing the issue at all, while I believe core should
>> address this.
>> Wim
>
I'm sorry, but both of you are missing the point. As has been
explained elsewhere, in docs and in many issues before: Drupal menu
paths are not the same as physical file paths. Menu paths may be
prefixed with "?q=", are passed in as GET query values (even with
clean URLs on, due to mod_rewrite) and can contain arbitrary
characters and Unicode.
To convert a menu path to a (relative) URL, we also need to urlencode
it, to make sure we get exactly the same string back in $_GET['q']
(which PHP urldecodes for us).
For example, the menu path "search/node/Quelque-chose en Français"
results in the URL "?q=search/node/Quelque-chose+en+Fran%C3%A7ais". When you
point your browser to that page, $_GET['q'] in PHP will contain the
original menu path "search/node/Quelque-chose en Français".
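
The round trip described above can be sketched as follows. This is a Python stand-in, not Drupal code: quote_plus() and unquote_plus() play the role of PHP's urlencode() and the automatic decoding of $_GET, and the helper names encode_menu_path() and decode_query_value() are hypothetical. Keeping "/" unescaped is an assumption made to match the URL shown in the example.

```python
from urllib.parse import quote_plus, unquote_plus


def encode_menu_path(path: str) -> str:
    """Encode a menu path for use as the value of ?q= (hypothetical helper).

    Spaces become '+', non-ASCII characters are percent-encoded as UTF-8,
    and '/' is left alone so the path stays readable.
    """
    return quote_plus(path, safe="/")


def decode_query_value(value: str) -> str:
    """Mimic what PHP does when it fills $_GET['q'] (hypothetical helper)."""
    return unquote_plus(value)


menu_path = "search/node/Quelque-chose en Français"
encoded = encode_menu_path(menu_path)
url = "?q=" + encoded

# The value recovered on the next request is the original menu path.
recovered = decode_query_value(encoded)
```

Here encoded is "search/node/Quelque-chose+en+Fran%C3%A7ais" and recovered equals menu_path, which is the property the encoding is meant to guarantee.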
To convert a file path to a (relative) URL, all you need to do is
prefix it with the base_path(). If you passed it through url() with
clean URLs off, you'd get a link to "?q=path/to/file" instead of
the actual file.
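
To make the difference concrete, here is a rough Python sketch of the two conversions. The functions menu_url() and file_url() are hypothetical simplifications, not Drupal's url() or base_path(); the base path "/" and the example file path are assumptions.

```python
from urllib.parse import quote_plus


def menu_url(path: str, clean_urls: bool, base_path: str = "/") -> str:
    """Simplified stand-in for turning a menu path into a relative URL."""
    encoded = quote_plus(path, safe="/")
    if clean_urls:
        return base_path + encoded
    # With clean URLs off, the path travels as the ?q= query value.
    return base_path + "?q=" + encoded


def file_url(file_path: str, base_path: str = "/") -> str:
    """Simplified stand-in for linking to a physical file: prefix only."""
    return base_path + file_path


as_menu_path = menu_url("files/report.pdf", clean_urls=False)
as_file_path = file_url("files/report.pdf")
```

With clean URLs off, as_menu_path is "/?q=files/report.pdf", which asks Drupal for a menu path, while as_file_path is "/files/report.pdf", which points at the file itself.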
Now, the ability for url() and l() to take and process full URLs is a
different matter entirely, and is useful both for the end user
(external menu items) and for coders (to cleanly add e.g. query
string arguments to a URL without worrying about '?' and '&' and
urlencoding). So if you want to manipulate URLs to files with these
functions, prefix the file path with the full $base_url, pass that in
and go to town.
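
As an illustration of what "cleanly adding query string arguments" to a full URL involves, here is a small Python sketch. add_query_args() is a hypothetical helper, not the behaviour of url() itself, and example.com stands in for $base_url.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query_args(url: str, extra: dict) -> str:
    """Append query arguments to a full URL (hypothetical helper).

    Existing arguments are kept, and the choice between '?' and '&'
    and the encoding of values are handled by the library.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs.extend(extra.items())
    return urlunsplit(parts._replace(query=urlencode(pairs)))


base_url = "http://example.com"  # assumed value of $base_url
plain = add_query_args(base_url + "/files/report.pdf", {"download": "1"})
merged = add_query_args(base_url + "/search?keys=a", {"note": "x & y"})
```

The first call yields "http://example.com/files/report.pdf?download=1"; the second yields "http://example.com/search?keys=a&note=x+%26+y", with the new value encoded and joined by '&'.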
In any case, there is nothing 'wrong' with the current use of
urlencodes in Drupal's links. It preserves exactly the same
$_GET['q'] value that you put into l() / url(), regardless of your
clean URL configuration. Adding a random urlencode / urldecode somewhere in
the chain would destroy that property. Remember that we also have to
work around some nasty mod_rewrite bugs (it's why we use
drupal_urlencode) which cause problems even for normal, non-Unicode
paths. The current approach is well-tested, solid and works for the
cases it is designed for.
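
A short Python sketch of why an extra decode or encode in the chain breaks the round trip, using the filename from the quoted message. quote_plus() and unquote_plus() again stand in for PHP's urlencode() and urldecode(); this does not model drupal_urlencode's mod_rewrite workarounds.

```python
from urllib.parse import quote_plus, unquote_plus

filename = "this is a+test.pdf"

# One encode, one decode: the value survives unchanged.
encoded = quote_plus(filename)
decoded_once = unquote_plus(encoded)

# An extra decode turns the literal '+' into a space.
decoded_twice = unquote_plus(decoded_once)

# An extra encode means a single decode no longer recovers the original.
encoded_twice = quote_plus(encoded)
decoded_after_double_encode = unquote_plus(encoded_twice)
```

Here decoded_once equals the original filename, decoded_twice is "this is a test.pdf", and decoded_after_double_encode is still the encoded string "this+is+a%2Btest.pdf".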
Steven Wittens