[development] Can .htaccess discard part of a path?

Seth Freach sfreach at gmail.com
Tue Nov 10 21:05:25 UTC 2009


Hi Nancy.

I haven't tested this, but try:

  RewriteCond %{QUERY_STRING} ^q=cgi-bin/printOriginal\.pl&file=/(.*)$ [NC]
  RewriteRule ^index\.php$ /%1? [L,R=301]

And see if that gives you a place to start.  The query string has to be 
matched in a RewriteCond on %{QUERY_STRING}, because a RewriteRule 
pattern only sees the path (and, in .htaccess, without the leading 
slash).  %1 is the group captured by the RewriteCond, and the trailing 
"?" drops the old query string from the redirect target.

The above assumes that clean URLs will translate the result to 
'index.php?q=' later, so that the 301 redirect (which Google will 
remember) points to a clean URL.  If you don't want it to work that 
way, change the "/%1?" in the example to "/index.php?q=%1".
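
Spelled out, that non-clean variant would be something like the 
following (equally untested).  The condition is on %{QUERY_STRING} 
since the RewriteRule pattern never sees anything after the "?":

  # Capture everything after "file=/" from the query string as %1
  RewriteCond %{QUERY_STRING} ^q=cgi-bin/printOriginal\.pl&file=/(.*)$ [NC]
  # Redirect to the non-clean form; the new query string replaces the old
  RewriteRule ^index\.php$ /index.php?q=%1 [L,R=301]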

You can throw some "RewriteCond %{HTTP_HOST}" lines in there too and 
change those also if you want to preserve the SEO value of links to old 
domains as well, but that's probably a topic for another list, I'd guess.
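
For instance, something along these lines, where "old-domain.example" 
is only a placeholder for whatever the old hostname was (untested 
sketch):

  # old-domain.example is a made-up placeholder, not a real hostname
  RewriteCond %{HTTP_HOST} ^(www\.)?old-domain\.example$ [NC]
  # Send the same path to the current site with a permanent redirect
  RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]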

Seth

(A Google search for "printOriginal.pl" turned up a few momsteam.com links.)

Nancy Wichmann wrote:
>
> Wow, how did you know about MomsTeam (now YouthSportsParents)?
>
>  
>
> I put this in there already    RewriteRule ^cgi-bin/printOriginal.pl/$ 
> http://www.example.com [R=301,L]
>
> And I am still seeing these come through to the Drupal log.
>
>  
>
> There might be a clue in RewriteRule ^alpha/sports/(.*) 
> http://www.example.com/sports/$1 [R=301,L] if I really 
> understood regular [sic] expressions.
>
>  
>
> Nancy E. Wichmann, PMP  
>
> Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. 
> King, Jr.
>
>  
>
> *From:* development-bounces at drupal.org 
> [mailto:development-bounces at drupal.org] *On Behalf Of *Seth Freach
> *Sent:* Tuesday, November 10, 2009 11:26 AM
> *To:* development at drupal.org
> *Subject:* Re: [development] Can .htaccess discard part of a path?
>
>  
>
> Nancy,
>
> I'm assuming this is a leftover from the moms team site?  The 
> requests are coming in because Google still appears to have lots of 
> these URLs in its index, and other sites still link to them.
>
> Instead of a rewrite, I'd suggest a response code 301 redirect.  
> This will be more Google friendly.
>
> Look in the default .htaccess file for the (commented out by default) 
> lines that deal with www. redirection (i.e., you always want people to 
> see "www" or never do, regardless of how they access the site).  Using 
> those patterns should help show you how to redirect to the same 
> content but without the "cgi-bin/printOriginal.pl&file=/".
>
> Seth
>
>
> Nancy Wichmann wrote:
>
> I am getting lots of requests like this:
>
> http://www.example.com/index.php?q=cgi-bin/printOriginal.pl&file=/alpha/beta/gamma/rage_prevention.shtml 
>
> The file argument is a valid page on our old site and is itself 
> redirected with a RewriteRule in .htaccess. However, 
> cgi-bin/printOriginal.pl does not exist and I have no idea what it was 
> supposed to do (well, I can guess: print the page). We get lots of 
> these requests for different pages. I have tried a simple rewrite rule 
> and a URL alias to prevent the 404 processing, but neither has fixed it.
>
> Is it possible to design a rewriterule that essentially discards the 
> "cgi-bin/printOriginal.pl" and just serves up the requested page 
> (well, after its own rewrite rule has worked)? So this would become
>
> http://www.example.com/index.php/alpha/beta/gamma/rage_prevention.shtml
>
>  
>
>  
>
> Nancy E. Wichmann, PMP
>
> Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. 
> King, Jr.
>
>  
>
>