[development] Wrong usage of urlencode and drupal_urlencode

Steven Wittens steven at acko.net
Fri Apr 13 19:38:50 UTC 2007


>> "This function is convenient when encoding a string to be used in  
>> a query part of a URL, as a convenient way to pass variables to  
>> the next page."
>> Currently the urlencode function is not only used for passing the  
>> url but also for displaying it, using l() for instance. Many  
>> modules experience errors because of this issue: upload,  
>> filefield, image, imagecache, .. Also just linking to a filename  
>> using l() fails. The value field always contains the correct  
>> filename (for example "this is a+test.pdf") while the actual link  
>> refers to the urlencoded filename (for example
>> "this+is+a%2Btest.pdf" or "this+is+a+test.pdf") on which it breaks,
>> since that file does not exist.
>> I feel that while url() shouldn't be altered, l() should urldecode  
>> (or have a drupal_urldecode) the output created by url().  
>> Currently all those modules are fixing this issue themselves, or  
>> are not addressing the issue at all, while I believe core should  
>> address this.
>> Wim
>


I'm sorry, but both of you are missing the point. As has been  
explained elsewhere, in docs and in many issues before: Drupal menu  
paths are not the same as physical file paths. Menu paths may be
prefixed with "?q=", are passed in as GET query values (even with
clean URLs on, due to mod_rewrite) and can contain arbitrary
characters, including Unicode.


To convert a menu path to a (relative) URL, we also need to urlencode  
it, to make sure we get exactly the same string back in $_GET['q']  
(which PHP urldecodes for us).

e.g. the menu path "search/node/Quelque-chose en Français" results in  
the URL "?q=search/node/Quelque-chose+en+Fran%C3%A7ais". When you
point your browser to that page, $_GET['q'] in PHP will contain the  
original menu path "search/node/Quelque-chose en Français".
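
A minimal sketch of that round trip, written in Python rather than PHP.
It assumes quote_plus() / unquote_plus() behave closely enough to PHP's
urlencode() / urldecode() for illustration; the two helper names are
hypothetical and are not Drupal functions.

```python
from urllib.parse import quote_plus, unquote_plus


def menu_path_to_query(path: str) -> str:
    """Hypothetical helper: encode a menu path as a '?q=' query value."""
    # Keep "/" readable; encode everything else that is unsafe in a query.
    return "?q=" + quote_plus(path, safe="/")


def query_to_menu_path(url: str) -> str:
    """Hypothetical helper: recover the menu path, as PHP does for $_GET['q']."""
    return unquote_plus(url[len("?q="):])


path = "search/node/Quelque-chose en Français"
url = menu_path_to_query(path)

# Spaces become "+", and "ç" becomes its UTF-8 bytes, %C3%A7.
print(url)

# Decoding gives back exactly the string that went in.
print(query_to_menu_path(url) == path)
```

The point of the encoding step is the second print: whatever goes in
comes back out unchanged, including spaces, "+" and non-ASCII characters.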


To convert a file path to a (relative) URL, all you need to do is
prefix it with the base_path(). If you passed it through url() with
clean URLs off, you'd get a link to "?q=path/to/file" instead of
the actual file.
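
To see why query-style encoding breaks links to files, here is a small
Python sketch. It assumes that a web server decodes percent escapes in
the path but leaves "+" alone, which unquote() models here. The helper
file_path_to_url() and the "files/" directory are hypothetical, and
the percent-encoding inside the helper is an addition of this sketch,
not something base_path() does.

```python
from urllib.parse import quote, quote_plus, unquote

filename = "this is a+test.pdf"

# Query-style encoding: space -> "+", literal "+" -> "%2B".
query_style = quote_plus(filename)

# Path-style decoding (assumed server behaviour): "%2B" -> "+",
# but "+" is NOT turned back into a space.
resolved = unquote(query_style)


def file_path_to_url(file_path: str, base_path: str = "/") -> str:
    """Hypothetical helper: prefix the base path, encode for a URL path."""
    # In a path, spaces must be %20 and a literal "+" is safest as %2B.
    return base_path + quote(file_path, safe="/")


file_url = file_path_to_url("files/" + filename)

print(query_style)  # the query-encoded name
print(resolved)     # what the server would look for on disk
print(file_url)     # a path-encoded link that decodes to the real name
```

The name the server resolves from the query-encoded link is not the
name on disk, which is the failure described in the quoted message.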


Now, the ability for url() and l() to take and process full URLs is a  
different matter entirely, and is useful both for the end user  
(external menu items) as well as coders (to cleanly add e.g. query  
string arguments to a URL without worrying about '?' and '&' and  
urlencoding). So if you want to manipulate URLs to files with these
functions, prefix the file path with the full $base_url, pass that in  
and go to town.
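
As a rough illustration of what "cleanly adding query string arguments"
involves, here is a Python sketch using only the standard library. The
add_query() helper and the example.com URLs are hypothetical; this is
not how url() is implemented, just the same idea in miniature.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query(url: str, params: dict) -> str:
    """Hypothetical helper: append query arguments to a full URL."""
    parts = urlsplit(url)
    # Keep any arguments already present, then add the new ones.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(params.items())
    # urlencode() handles the '&' separators and the escaping of values.
    return urlunsplit(parts._replace(query=urlencode(query)))


plain = add_query("http://example.com/files/report.pdf", {"download": "1"})
extra = add_query("http://example.com/files/report.pdf?v=2", {"name": "a b&c"})

print(plain)
print(extra)
```

The caller never has to decide between '?' and '&' or escape values by
hand, which is the convenience described above.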

In any case, there is nothing 'wrong' with the current use of  
urlencodes in Drupal's links. It preserves exactly the same
$_GET['q'] value that you put into l() / url(), regardless of your clean
URL configuration. Adding a random urlencode / urldecode somewhere in  
the chain would destroy that property. Remember that we also have to  
work around some nasty mod_rewrite bugs (it's why we use  
drupal_urlencode) which cause problems even for normal, non-Unicode  
paths. The current approach is well-tested, solid and works for the  
cases it is designed for.

Steven Wittens


