#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2009
    Location
    Tropical Paradise
    Posts
    3
    Rep Power
    0

    Advice on using rewrite for "GET / HTTP/1.0" abuse


    I'm usually a lurker and reader here, but now I have something I need help and advice with. On a WP site I am finding hundreds of self-generated "backlinks" using "GET / HTTP/1.0" and a URL. I have looked all over online but am probably searching for the wrong thing.

    The best idea I have found that looks like it might at least force a 404 (?) is this:
    Code:
    RewriteCond %{REQUEST_FILENAME} !-f 
    RewriteCond %{REQUEST_FILENAME} !-d 
    RewriteRule (.*)$ http://www.mysite.com/$1/ [R=301,L]
    Has anyone here solved this problem with a rewrite like this? Would it do what I am trying to do, which is to stop the self-generated links to sites I have never even visited? They are not helping my site.
    Thank you.
  2. #2
  3. mod_dev_shed
    Devshed Supreme Being (6500+ posts)

    Join Date
    Sep 2002
    Location
    Atlanta, GA
    Posts
    14,817
    Rep Power
    1099
    I don't understand
    On a WP site I am finding hundreds of self generated "backlinks" using "GET / HTTP/1.0" and an URL.
    Explain this a bit more, preferably with an example (you can't assume we know WordPress).
    # Jeremy

    Explain your problem instead of asking how to do what you decided was the solution.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2009
    Location
    Tropical Paradise
    Posts
    3
    Rep Power
    0
    Originally Posted by jharnois
    I don't understand. Explain this a bit more, preferably with an example (you can't assume we know WordPress).
    It took some time, but I resolved the problem. Basically, low-quality sites exploit a WordPress feature: the site gets a pingback (an artificial backlink) when someone enters a URL as a non-existent page on a WordPress site. Sites selling backlink services send bots to enter the URL they want to appear as a backlink, and it pings search engines as if it were an existing link. There are no actual links in either direction, but over time enough hits will make these low-quality sites appear "linked" to your site.

    Here are examples from the raw access logs (URIs disguised):
    178.137.129.xx - - [18/Feb/2012:15:09:55 -0600] "GET / HTTP/1.0" 200 7638 "crappy.siteURL.ru" "Mozilla/5.0 (Windows NT 5.1; U; en) Opera 8.00"
    178.137.129.xx - - [18/Feb/2012:15:10:01 -0600] "GET / HTTP/1.0" 200 7638 "crappy.siteURL.ru" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
    178.137.129.xx - - [18/Feb/2012:15:10:08 -0600] "GET / HTTP/1.0" 200 7638 "crappy.siteURL.ru" "Mozilla/4.0 (compatible; MSIE 6.0; Update a; AOL 6.0; Windows 98)"

    These requests show different user agents, all from the same IP and within seconds of each other, so it is a bot with spoofed UAs.
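
    For anyone seeing the same pattern, here is a minimal sketch of an .htaccess rule that refuses requests by referrer instead of trying to force a 404. It is not the fix the host applied. It assumes mod_rewrite is enabled and that the spam requests keep sending the same referrer; "crappy.siteURL.ru" is only the disguised placeholder from the log lines above and would need to be replaced with the real host.

    ```apache
    # Sketch only: assumes mod_rewrite is enabled and .htaccess overrides are allowed.
    # "crappy\.siteURL\.ru" is the disguised placeholder from the log lines, not a real host.
    RewriteEngine On
    # Match the Referer header case-insensitively against the placeholder host
    RewriteCond %{HTTP_REFERER} crappy\.siteURL\.ru [NC]
    # Leave the URL unchanged ("-") and answer with 403 Forbidden
    RewriteRule ^ - [F]
    ```

    On a WordPress site this would normally go above the "# BEGIN WordPress" block so it runs before WordPress's own rewrite rules. Since the Referer header is as easy to spoof as the user agent, this only stops bots that keep sending that particular referrer.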

    Maybe Google's recent devaluation or change in valuation of some backlinks will eventually stop the referrer spam deluge? I always believed it was harmless if useless until I started seeing these junk links as backlinks in Google Webmaster Tools recently.

    When I tried to ensure that a 404 was issued, the rewrite caused a server error and had to be removed. After two days of back and forth with the host, they fixed the problem without telling me why a normal .htaccess rewrite would cause 500 server errors. The same code that gave me 500 errors is working fine now, so the cause was somewhere above my site and I can't tell you what it was. But it is fixed, thanks.
