Apache Development
 
#1
June 18th, 2009, 02:11 PM
Clintonio
Blocking bots, should I bother?

In my .htaccess (soon to move to my httpd.conf) I have this massive bundle of rewrites, all dedicated to stopping unfriendly bots and browsers from viewing my site.

Code:
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]


My server has had performance issues for a long time now due to Apache hits. Little processing occurs on my site most of the time (a small PHP script); most of the load seems to be related to Apache itself.

This isn't all I have in my .htaccess; there are about 200 more rewrites below that bit of code, most of which will be removed for being totally pointless and superfluous.

Should I bother blocking these bots? Or, as an image host, should I be concerned about bots?
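
On the planned move to httpd.conf mentioned above: a minimal sketch of what that could look like, assuming the site lives under a directory such as /var/www/example (a placeholder path, not from the post). Only two of the conditions are repeated here for brevity. With AllowOverride None, Apache does not look for .htaccess files under that directory at all, so the rules are read once at startup instead of on each request.

```apache
# httpd.conf or a virtual host file; the directory path is a placeholder
<Directory "/var/www/example">
    # Ignore .htaccess files under this directory entirely
    AllowOverride None

    RewriteEngine On
    # Two conditions from the list above; the rest would follow the same pattern
    RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw
    # Return 403 Forbidden for any matching request ([F] implies [L])
    RewriteRule ^ - [F]
</Directory>
```

Note that AllowOverride None disables every .htaccess directive under that directory, so anything else currently in the .htaccess would need to move as well, and later changes need a server reload to take effect.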

#2
June 18th, 2009, 06:42 PM
E-Oreo
Why are you blocking them? Having all of those rewrite rules in there probably causes more processing overhead than just serving them the requested file.
Comments on this post
Clintonio agrees: Thanks, I really didn't think about it enough
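
To make the overhead point concrete: when the rules live in .htaccess, Apache loads that file on every request, and a request from an ordinary browser has to fail every one of the [OR] conditions before it is let through. If some blocking is still wanted, a hedged sketch of a more compact form is to fold the names into one or two alternations. Only a subset of the original names is shown, and whether this is measurably faster on a given server is an assumption that would need testing.

```apache
RewriteEngine On
# Names that were anchored and case-sensitive in the original list (subset)
RewriteCond %{HTTP_USER_AGENT} ^(BlackWidow|ChinaClaw|Custo|DISCo|Download\ Demon|eCatch|EirGrabber) [OR]
# Names that were unanchored and case-insensitive in the original list
RewriteCond %{HTTP_USER_AGENT} (HTTrack|Indy\ Library) [NC]
# Return 403 Forbidden; [F] implies [L], so no further rules run
RewriteRule ^ - [F]
```

The two conditions are kept separate because the original list mixes anchored, case-sensitive patterns with unanchored [NC] ones; merging them into a single pattern would change what gets matched.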

#3
June 19th, 2009, 10:15 AM
Clintonio
Quote:
Originally Posted by E-Oreo
Why are you blocking them? Having all of those rewrite rules in there probably causes more processing overhead than just serving them the requested file.


A few years back I saw a site that claimed I should. At the time I didn't know any better, and I haven't modified that section of that site's .htaccess since.

I agree with you here. It's probably just not worth the overhead, given that I have millions of hits on that .htaccess monthly, and I bet less than 0.1% are bots.

Gonna go remove it. I was seriously doubting its effectiveness anyway.
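
The 0.1% figure is a guess. One way to check it before deleting the rules would be to log matching requests separately for a while and compare line counts with the main access log. A minimal sketch, assuming it goes in httpd.conf (CustomLog is not allowed in .htaccess), that a "combined" log format is already defined there as in the stock configuration, and using a made-up variable name (suspected_bot) and log path:

```apache
# Tag requests whose User-Agent contains one of these names (subset of the list);
# the variable name suspected_bot is hypothetical
SetEnvIfNoCase User-Agent "(HTTrack|WebZIP|Teleport Pro)" suspected_bot

# Write only the tagged requests to their own log file; the path is a placeholder
CustomLog logs/suspected_bots.log combined env=suspected_bot
```

This only records the requests and does not block anything, so it can run alongside or instead of the rewrite rules while the numbers are collected.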

© 2003-2009 by Developer Shed. All rights reserved. DS Cluster 6 Hosted by Hostway