On Jan 19, 2010, at 4:04 PM, Khalid Baheyeldin wrote:
On Tue, Jan 19, 2010 at 2:15 PM, Alex Barth <alex@developmentseed.org> wrote:
After getting a report that
http://news.google.com/news?pz=1&hl=ar&q=سوريا&cf=all&output=rss
Could this be a UTF-8 issue? The q= parameter contains "Syria" in Arabic. Is it being stripped out somewhere in one of Drupal's layers?
bangpound pointed that out on the issue queue, too. Indeed, URL-encoding the Arabic string fixes the behavior I described; my guess that Google News might require special request parameters was simply off track. What I am still not clear about is whether wget and PHP streams sanitize the URL better before making the request, or whether non-ASCII characters are allowed in an HTTP URL and curl just doesn't support them.
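
For illustration, here is a minimal Python sketch of the URL-encoding fix discussed above. It is not the feed-fetching code in question: the names base, params and url are assumptions made up for this example. It percent-encodes the Arabic query value as UTF-8 so that the request URL contains only ASCII characters.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Query parameters taken from the feed URL in this thread.
# The variable names are hypothetical and only used for this sketch.
base = "http://news.google.com/news"
params = {
    "pz": "1",
    "hl": "ar",
    "q": "سوريا",  # "Syria" in Arabic
    "cf": "all",
    "output": "rss",
}

# urlencode() percent-encodes each value using UTF-8 by default,
# so the resulting URL is pure ASCII.
url = base + "?" + urlencode(params)
print(url)

# Decoding the query string again recovers the original Arabic text.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0])
```

The encoded form can be handed to any HTTP client without relying on the client to clean up the URL itself.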
--
Khalid M. Baheyeldin
2bits.com, Inc.
http://2bits.com
Drupal optimization, development, customization and consulting.
Simplicity is prerequisite for reliability. -- Edsger W. Dijkstra
Simplicity is the ultimate sophistication. -- Leonardo da Vinci
Alex Barth http://www.developmentseed.org/blog tel (202) 250-3633