It looks like path_auto is (or was) in use: links/equipment_tools_supplies_farm looks very much like a path_auto-generated URL. My guess is that there might be a collision between the path_auto links generated for taxonomy terms and those generated for content. This is just a guess, but you might try examining your path_auto rules to make sure the same prefixes aren't being used for both taxonomy terms and pages.
As a corrective step, you might consider disabling path_auto, then going into "administer url aliases" to remove any bogus or conflicting aliases. If you find a collision in the path_auto rules, correct it and force an update of all links and taxonomy links.
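One way to look for conflicting aliases outside the admin screen is to export the alias list and group it by alias. The following is only a sketch: the (system path, alias) pairs are made-up sample rows, not data from the site in question, and the function name `find_alias_collisions` is hypothetical rather than anything Drupal provides.

```python
from collections import defaultdict


def find_alias_collisions(aliases):
    """Return {alias: sorted system paths} for aliases that map to more than one path."""
    targets = defaultdict(set)
    for system_path, alias in aliases:
        # Normalise so "Links/Foo/" and "links/foo" count as the same alias.
        targets[alias.strip("/").lower()].add(system_path)
    return {alias: sorted(paths) for alias, paths in targets.items() if len(paths) > 1}


# Hypothetical sample rows (system path, alias); replace with an export of the real list.
sample_aliases = [
    ("node/12", "links/equipment_tools_supplies_farm"),
    ("taxonomy/term/7", "links/equipment_tools_supplies_farm"),
    ("node/15", "growing-vegetables/tomatoes"),
]

collisions = find_alias_collisions(sample_aliases)
for alias, paths in collisions.items():
    print(alias, "->", ", ".join(paths))
```

An empty result would suggest the 404s are not caused by two items sharing one alias, which points back to the question of where the bad requests originate.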
Another thing to check and report back on is, "Where are these 404s coming from?" What is the referring page (if you can find that in your web server logs)? That should give you a clue as to the source of the navigational error. It could also be relative links in embedded content created by tinyMCE or some such. Viewing the source of the referring page ought to help you track that down. Does the source contain the doubled links?
Good luck.
-----Original Message-----
From: support-bounces@drupal.org [mailto:support-bounces@drupal.org] On Behalf Of Gregg Banse
Sent: Tuesday, December 06, 2005 7:21 AM
To: support@drupal.org
Subject: [support] messed up URLs
Hello, My apologies in advance - I've been trying to find a solution to the following issue. I have a feeling the answer must be in the forums but I can't seem to find it.
I've noticed from day 1 of using Drupal that I have an unusual number of 404s for URLs that do not exist. Portions of the URL path exist, but not the full URL. Example:
/links/links/equipment_tools_supplies_farm
"links" exists as does "links/equipment_tools_supplies_farm" but not the full URL
I don't know where these are coming from.
My install is Drupal 4.6.3 with the following modules: atom (not in use), contact_dir, event, feedback, flexinode, forms, image, img_assist, node_metatags, pathauto (not in use), print, scheduler, sitemenu, survey, taxonomy_access, taxonomy_context, tinymce
A nudge in the proper direction would be most appreciated.
Thanks Gregg
-- [ Drupal support list | http://lists.drupal.org/ ]
Thanks for the responses,
> Looks like path_auto is (or was) in use, noticing that
path_auto was in use for a few hours several months ago but then I opted to use URL aliases of my own creation.
> links/equipment_tools_supplies_farm looks very much like a path_auto generated URL. My guess is that there might be some collision between path_auto links generated by the taxonomy vs. path_auto links generated for the content. This is just a guess, but you might try examining your path_auto rules here to make sure the same prefixes aren't being used for taxonomy terms vs. pages.
I couldn't find any duplicates in the list of aliases.
> Another thing to check and report back on is, "Where are these 404s coming from?" What is the referring page (if you can find that in your web server logs)? That should give you a clue as to the source of the navigational error. It could also be relative links in embedded content created by tinyMCE or some such. Viewing the source of the referring page ought to help you track that down. Does the source contain the doubled links?
Here are two lines from my logs.
ac5-webproxy26.direcpc.com - - [06/Dec/2005:12:11:30 -0700] "GET /growing-vegetables/tomatoes HTTP/1.0" 200 17214 "http://www.google.com/search?sourceid=navclient&ie=UTF-8&rls=GGLG,GGLG:2005-19,GGLG:en&q=tomatoes+in+newspaper" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)"
ac5-webproxy26.direcpc.com - - [06/Dec/2005:12:11:31 -0700] "GET /growing-vegetables/modules/event/event.js HTTP/1.0" 404 9595 "http://www.farm-garden.com/growing-vegetables/tomatoes" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)"
The first GET looks like a legitimate search query at Google, and it does deliver a legitimate page on my site when you click the link from the SERP. The second GET, though, comes from that same page one second later and asks for a path that does not exist on my site, so as you can see it returns a 404.
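To answer the "where are these 404s coming from?" question across a whole log rather than two lines, the access log can be grouped by referrer. This is a rough sketch that assumes the log is in the combined format shown above; the names `LOG_PATTERN` and `referrers_of_404s` are made up for illustration, and the sample input is just the two lines quoted in this message.

```python
import re
from collections import Counter

# Combined log format: host ident user [time] "request" status size "referrer" "agent"
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def referrers_of_404s(lines):
    """Count (referrer, requested path) pairs for every request that returned 404."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match and match.group("status") == "404":
            counts[(match.group("referrer"), match.group("path"))] += 1
    return counts


AGENT = ('"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; '
         '.NET CLR 1.0.3705; .NET CLR 1.1.4322)"')
sample_lines = [
    'ac5-webproxy26.direcpc.com - - [06/Dec/2005:12:11:30 -0700] '
    '"GET /growing-vegetables/tomatoes HTTP/1.0" 200 17214 '
    '"http://www.google.com/search?sourceid=navclient&ie=UTF-8'
    '&rls=GGLG,GGLG:2005-19,GGLG:en&q=tomatoes+in+newspaper" ' + AGENT,
    'ac5-webproxy26.direcpc.com - - [06/Dec/2005:12:11:31 -0700] '
    '"GET /growing-vegetables/modules/event/event.js HTTP/1.0" 404 9595 '
    '"http://www.farm-garden.com/growing-vegetables/tomatoes" ' + AGENT,
]

not_found = referrers_of_404s(sample_lines)
for (referrer, path), hits in not_found.most_common():
    print(hits, path, "<-", referrer)
```

For a real log, pass an open file object instead of `sample_lines`. Lines that do not match the pattern are silently skipped, so a very low count may simply mean the server uses a different log format.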
I went to http://www.farm-garden.com/growing-vegetables/tomatoes and checked all of the links, and they all seem fine (i.e. they point to legitimate resources).
FYI - I have noted on occasion that some page elements are not delivered - as if the webserver times out. These are typically images and it doesn't always occur. Makes me wonder if I've missed something in the setup or should somehow tune it.
TinyMCE is being used mostly for outbound links by my writers. I hand code all of my links.
G. gbanse@farm-garden.com
Farm & Garden
5 Heaton Street
Montpelier, Vermont 05602
802-223-6101
website: www.farm-garden.com
forums: forums.farm-garden.com
Looks to me as if you have been bitten by this bug: http://drupal.org/node/13148
I wouldn't hold my breath waiting for this one to be fixed.
...R.
> Looks to me as if you have been bitten by this bug: http://drupal.org/node/13148
After rereading it, I think you're right - I missed the part about proxy servers.
When I first read that thread I lost track of who was talking about which issue. Correct me if I'm wrong, but it seems there are two opposing desires: one side wants to maintain standards at the cost of SE rankings and visitors, while the other wants the world to see their website properly.
Someone in there also said this issue only affected the bots. I disagree. From my logs:
ac5-webproxy26.direcpc.com - - [06/Dec/2005:12:11:31 -0700] "GET /growing-vegetables/modules/event/event.js HTTP/1.0" 404 9595 "http://www.farm-garden.com/growing-vegetables/tomatoes" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)"
I believe that's a legitimate web browser coming through a DirecPC proxy server. It fetched the first page fine but got the URL of the second request wrong. So to say this issue is strictly limited to bots which don't follow standards is incorrect.
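The bad paths in this thread are what you would get if a relative link were resolved against the page's own URL instead of the site root. The sketch below only illustrates that arithmetic. It assumes the page markup contains a relative link such as modules/event/event.js and that the client ignores whatever base URL the page declares; neither assumption has been verified against the actual page source. The helper name `resolved_path` is made up.

```python
from urllib.parse import urljoin, urlsplit


def resolved_path(base_url, relative_link):
    """Resolve a relative link against a base URL and return only the path part."""
    return urlsplit(urljoin(base_url, relative_link)).path


site_root = "http://www.farm-garden.com/"
page = "http://www.farm-garden.com/growing-vegetables/tomatoes"
script = "modules/event/event.js"  # assumed relative link in the page markup

# Resolved against the site root, the link lands where the file would live.
print(resolved_path(site_root, script))

# Resolved against the page URL, it matches the path in the 404 log line.
print(resolved_path(page, script))

# The same arithmetic reproduces the doubled "links" segment from the first message.
links_page = "http://www.farm-garden.com/links/equipment_tools_supplies_farm"
print(resolved_path(links_page, "links/equipment_tools_supplies_farm"))
```

If that is what is happening, any client that resolves links this way would request the wrong path, whether it is a bot or a browser behind a proxy.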
I will discuss this with one of the Google engineers - if I can get his attention long enough. ;)
Gregg