[development] cURL and drupal_http_request do not properly download certain Google News feeds

Alex Barth alex at developmentseed.org
Tue Jan 19 21:15:47 UTC 2010


On Jan 19, 2010, at 4:04 PM, Khalid Baheyeldin wrote:

> On Tue, Jan 19, 2010 at 2:15 PM, Alex Barth  
> <alex at developmentseed.org> wrote:
>
> After getting a report that
>
> http://news.google.com/news?pz=1&hl=ar&q=سوريا&cf=all&output=rss
>
> Could be a UTF-8 issue? The q= has "Syria"  (in Arabic) in it. Is  
> that stripped out
> somewhere in some layer in Drupal?

bangpound pointed that out on the issue queue, too. Indeed, URL-encoding
the Arabic string fixes the behavior I described. My guess that Google
News might require special request parameters was simply not on the
right track.
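
For illustration, here is a minimal sketch of the fix in Python (not the
Drupal code itself): percent-encode the UTF-8 bytes of the q= value
before building the feed URL. The variable names are my own, and the
round-trip check is only there to show that nothing is lost.

```python
from urllib.parse import quote, unquote

# The q= value from the reported feed URL: "Syria" in Arabic.
query = "سوريا"

# Percent-encode the UTF-8 bytes of the value; safe="" escapes
# every character that is not an unreserved ASCII character.
encoded = quote(query, safe="")

# Rebuild the feed URL with the encoded value; the result is pure ASCII.
url = "http://news.google.com/news?pz=1&hl=ar&q=" + encoded + "&cf=all&output=rss"
print(url)

# Decoding gives back the original string, so the server sees the same query.
print(unquote(encoded) == query)
```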

What I am not clear about now is whether wget and PHP streams
sanitize the URL better before making the request, or whether non-ASCII
characters are allowed in an HTTP URL and curl just doesn't support
them.
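
As far as the URI syntax in RFC 3986 goes, a URI is made of ASCII
characters only, so non-ASCII characters have to be percent-encoded
somewhere before the request line is written. The sketch below shows
what such a sanitizing step could look like in Python. It is an
assumption for illustration, not a description of what wget or PHP
streams actually do, and to_ascii_url is a hypothetical helper name. It
also does not handle non-ASCII host names.

```python
from urllib.parse import quote

def to_ascii_url(url: str) -> str:
    """Percent-encode non-ASCII characters while leaving URL syntax intact.

    Hypothetical helper for illustration only.
    """
    if url.isascii():
        # Nothing to do: the URL is already plain ASCII.
        return url
    # Keep reserved characters and existing %XX escapes as they are;
    # everything else outside the unreserved set is encoded as UTF-8 bytes.
    return quote(url, safe=":/?#[]@!$&'()*+,;=%~")

# An ASCII URL passes through unchanged.
print(to_ascii_url("http://example.com/a?b=c"))

# A URL with an Arabic query value comes back as pure ASCII.
print(to_ascii_url("http://news.google.com/news?pz=1&hl=ar&q=سوريا&cf=all&output=rss"))
```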


> -- 
> Khalid M. Baheyeldin
> 2bits.com, Inc.
> http://2bits.com
> Drupal optimization, development, customization and consulting.
> Simplicity is prerequisite for reliability. --  Edsger W.Dijkstra
> Simplicity is the ultimate sophistication. --   Leonardo da Vinci

Alex Barth
http://www.developmentseed.org/blog
tel (202) 250-3633





