[development] cURL and drupal_http_request do not properly download certain Google News feeds

Alex Barth alex at developmentseed.org
Tue Jan 19 19:15:52 UTC 2010


After getting a report that

http://news.google.com/news?pz=1&hl=ar&q=سوريا&cf=all&output=rss

does not download properly with the Feeds module, I dug in and
discovered that cURL and drupal_http_request() return an RSS feed with
no items, while wget and PHP's stream_get_contents() return a full
RSS feed with a number of items.

Details here: http://drupal.org/node/689552

I am unsure what is actually causing this peculiar behavior and I  
would appreciate people's input. The issue affects not just Feeds but  
any other Drupal module that downloads and processes Google News RSS  
feeds - including core aggregator.

- This seems to be an issue where Google News decides, based on some  
request parameters, what content to return and what not - or am I  
missing something?
- The user agent is the same in cases where the issue occurs and where
it doesn't, and I am using the same machine for all tests. What else
could Google use to distinguish my requests?
- Any tips on an 'HTTP monitor' I could use to watch outgoing HTTP
requests from my local machine?
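
On the 'HTTP monitor' question, one possible approach is to point each
client at a small local server that records the request line and
headers it receives, then compare what cURL, wget and the PHP functions
actually send. The following is only a minimal sketch using the Python
standard library, not something taken from the issue thread. The names
captured, EchoHandler and start_monitor are made up for illustration.
It cannot observe traffic to Google itself, only what each client sends
when given the same path and query.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Each captured request is stored as (request_line, headers_dict).
captured = []


class EchoHandler(BaseHTTPRequestHandler):
    """Records the request line and headers of every GET it receives."""

    def do_GET(self):
        # Repeated header names collapse to the last value in this dict.
        captured.append((self.requestline, dict(self.headers.items())))
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging on stderr


def start_monitor(port=0):
    """Serve on localhost in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    server = start_monitor()
    port = server.server_address[1]
    # Bypass any configured proxy so the request reaches the local monitor.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    opener.open("http://127.0.0.1:%d/news?output=rss" % port).read()
    request_line, headers = captured[-1]
    print(request_line)
    for name, value in sorted(headers.items()):
        print("%s: %s" % (name, value))
    server.shutdown()
```

With the monitor started on a fixed port, the same path and query can
be requested with curl, wget and drupal_http_request() in turn, and the
captured entries compared side by side.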

Alex Barth
http://www.developmentseed.org/blog
tel (202) 250-3633





