    #1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2012
    Posts
    33
    Rep Power
    3

    Simple Spam Bot Blocker


    Hi, I just did this today and would like some feedback on how to improve it.
    I highly doubt any spam bot would ever get around this, but maybe I can learn something more by getting feedback, or maybe help others to implement a simple PHP spam bot blocker on their forms.
    I'm not taking credit for the password generation function, which I found a long time ago.
    PHP Code:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Spam Bot Blocker</title>
    </head>

    <body>

    <?php 
    include("login/include/session.php");

    if ((isset($_POST['authenticate']))&&(!empty($_POST['authenticate'])))    {
    if (!empty($_POST['HumanCode0'])){$HumanCode0 = safe($_POST['HumanCode0']);}else{$HumanCode0 = "";}
    if (!empty($_POST['HumanCode1'])){$HumanCode1 = safe($_POST['HumanCode1']);}else{$HumanCode1 = "";}
    if (!empty($_POST['HumanCode2'])){$HumanCode2 = safe($_POST['HumanCode2']);}else{$HumanCode2 = "";}
    if (!empty($_POST['HumanCode3'])){$HumanCode3 = safe($_POST['HumanCode3']);}else{$HumanCode3 = "";}
    if (!empty($_POST['HumanCode4'])){$HumanCode4 = safe($_POST['HumanCode4']);}else{$HumanCode4 = "";}
    if (!empty($_POST['HumanCode5'])){$HumanCode5 = safe($_POST['HumanCode5']);}else{$HumanCode5 = "";}
    if (!empty($_POST['HumanCode6'])){$HumanCode6 = safe($_POST['HumanCode6']);}else{$HumanCode6 = "";}

    $HumanCodeCombined = $HumanCode0.$HumanCode1.$HumanCode2.$HumanCode3.$HumanCode4.$HumanCode5.$HumanCode6;
    $SessionHumanCodeCombined = $_SESSION['HumanCode0'].$_SESSION['HumanCode1'].$_SESSION['HumanCode2'].$_SESSION['HumanCode3'].$_SESSION['HumanCode4'].$_SESSION['HumanCode5'].$_SESSION['HumanCode6'];
    echo "Posted code = ".$HumanCodeCombined."<br>";
    echo "Session code = ".$SessionHumanCodeCombined."<br>";

    if ($HumanCodeCombined == $SessionHumanCodeCombined){echo "code match<br>";}
    if ($HumanCodeCombined != $SessionHumanCodeCombined){echo "code did not match<br>";}

    $_SESSION['HumanCode0'] = "";
    $_SESSION['HumanCode1'] = "";
    $_SESSION['HumanCode2'] = "";
    $_SESSION['HumanCode3'] = "";
    $_SESSION['HumanCode4'] = "";
    $_SESSION['HumanCode5'] = "";
    $_SESSION['HumanCode6'] = "";

    }
    ?>

    <form action="" method="post" name="">
          <table width="120" border="0" cellspacing="0" cellpadding="0">
    <?php 
            
    function generatePassword($length=5,$level=1){
       list($usec, $sec) = explode(' ', microtime());
       srand((float) $sec + ((float) $usec * 100000));

       $validchars[1] = "0123456789abcdfghjkmnpqrstvwxyz";
       $validchars[2] = "0123456789abcdfghjkmnpqrstvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
       $validchars[3] = "0123456789_!@#$%&*()-=+/abcdfghjkmnpqrstvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_!@#$%&*()-=+/";

       $password  = "";
       $counter   = 0;
       while ($counter < $length) {
         $actChar = substr($validchars[$level], rand(0, strlen($validchars[$level])-1), 1);
         // All characters must be different
         if (!strstr($password, $actChar)) {
            $password .= $actChar;
            $counter++;
         }
       }
       return $password;
    }

    // generate random code
    $HumanCode = generatePassword(7,2);

    // randomize the 7 variables to either be empty or not
    $HumanCode0active = rand(0,1);
    $HumanCode1active = rand(0,1);
    $HumanCode2active = rand(0,1);
    $HumanCode3active = rand(0,1);
    $HumanCode4active = rand(0,1);
    $HumanCode5active = rand(0,1);
    $HumanCode6active = rand(0,1);
    ?>
                <tr align="center" valign="middle">
                  <td height="20"><?php echo $HumanCode[0];?></td>
                  <td><?php echo $HumanCode[1];?></td>
                  <td><?php echo $HumanCode[2];?></td>
                  <td><?php echo $HumanCode[3];?></td>
                  <td><?php echo $HumanCode[4];?></td>
                  <td><?php echo $HumanCode[5];?></td>
                  <td><?php echo $HumanCode[6];?></td>
                </tr>
                <tr align="center" valign="middle" >
                  <td height="20"><?php if (!empty($HumanCode0active)) {?><input type="text" name="HumanCode0" style="width:10px" value=""/> <?php $_SESSION['HumanCode0'] = $HumanCode[0]; } else { echo $HumanCode[0]; $_SESSION['HumanCode0'] = "";}?> </td>
                  <td><?php if (!empty($HumanCode1active)) {?><input type="text" name="HumanCode1" style="width:10px" value=""/> <?php $_SESSION['HumanCode1'] = $HumanCode[1]; } else { echo $HumanCode[1]; $_SESSION['HumanCode1'] = "";}?> </td>
                  <td><?php if (!empty($HumanCode2active)) {?><input type="text" name="HumanCode2" style="width:10px" value=""/> <?php $_SESSION['HumanCode2'] = $HumanCode[2]; } else { echo $HumanCode[2]; $_SESSION['HumanCode2'] = "";}?> </td>
                  <td><?php if (!empty($HumanCode3active)) {?><input type="text" name="HumanCode3" style="width:10px" value=""/> <?php $_SESSION['HumanCode3'] = $HumanCode[3]; } else { echo $HumanCode[3]; $_SESSION['HumanCode3'] = "";}?> </td>
                  <td><?php if (!empty($HumanCode4active)) {?><input type="text" name="HumanCode4" style="width:10px" value=""/> <?php $_SESSION['HumanCode4'] = $HumanCode[4]; } else { echo $HumanCode[4]; $_SESSION['HumanCode4'] = "";}?> </td>
                  <td><?php if (!empty($HumanCode5active)) {?><input type="text" name="HumanCode5" style="width:10px" value=""/> <?php $_SESSION['HumanCode5'] = $HumanCode[5]; } else { echo $HumanCode[5]; $_SESSION['HumanCode5'] = "";}?> </td>
                  <td><?php if (!empty($HumanCode6active)) {?><input type="text" name="HumanCode6" style="width:10px" value=""/> <?php $_SESSION['HumanCode6'] = $HumanCode[6]; } else { echo $HumanCode[6]; $_SESSION['HumanCode6'] = "";}?> </td>
                </tr>
              </table>
               
          <input name="authenticate" type="submit" value="Authenticate" />
    </form>

    </body>
    </html>
    I'm using mysql_real_escape_string for all my POST values, hence the safe() around the POST variables.
    The function is inside session.php, as seen below.
    PHP Code:
    function safe($value){
       return mysql_real_escape_string($value);
    }

  2. #2
  3. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,959
    Rep Power
    1014
    Hi,

    interesting approach. Unfortunately, it has the same problem that many home-made CAPTCHAs have: spam bots are better at solving this than humans. Reading two table rows and filling out all input fields with the values of the row above is trivial for a machine. A computer will solve this before the human brain has even understood the task.

    You could actually write your own CAPTCHA breaker. All you need is an HTML parser like DOMDocument.

    Apart from this, there are several issues:

    • You need to learn about loops. Repeating actions isn't done by copying and pasting the code.
    • Don't use code you found somewhere on the Internet "a long time ago". Most of the PHP code you'll find online is garbage and hopelessly outdated. This "generatePassword" function is bloated and very, very weak, because it uses the current time (which everybody knows) as a starting point and then uses a deterministic algorithm to calculate a bunch of pseudo-random numbers. So it's easy to predict the whole sequence.
    • Your "safe" function makes no sense. As the name "mysql_real_escape_string" already says, this function is for a MySQL database. You don't have anything like that in your code. What you want is an HTML escaping function: htmlspecialchars()
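To make the three points above concrete, here is a minimal sketch rather than a drop-in fix. The helper names collectHumanCode() and randomCode() are made up for illustration, and random_int() assumes PHP 7 or later, which is newer than the code in the first post. The loop replaces the seven copied lines, random_int() avoids the time-based seed, and htmlspecialchars() escapes for HTML output instead of SQL. Unlike the original generatePassword(), this sketch allows repeated characters.

```php
<?php
// Hypothetical helper: read HumanCode0..HumanCode6 from a POST-like array with a loop.
function collectHumanCode(array $post, int $fields = 7): string
{
    $combined = '';
    for ($i = 0; $i < $fields; $i++) {
        $key = 'HumanCode' . $i;
        // isset() + is_string() rather than !empty(), so the character "0" is kept.
        if (isset($post[$key]) && is_string($post[$key])) {
            $combined .= $post[$key];
        }
    }
    return $combined;
}

// Hypothetical replacement for generatePassword(): random_int() draws from the
// operating system's CSPRNG, so no seeding with the current time is involved.
function randomCode(int $length, string $alphabet): string
{
    $code = '';
    $max = strlen($alphabet) - 1;
    for ($i = 0; $i < $length; $i++) {
        $code .= $alphabet[random_int(0, $max)];
    }
    return $code;
}

$posted = collectHumanCode(['HumanCode0' => 'a', 'HumanCode3' => '<b>', 'HumanCode6' => '0']);

// Escape for HTML output, not for a database.
echo 'Posted code = ' . htmlspecialchars($posted, ENT_QUOTES, 'UTF-8') . "\n";
echo 'New code = ' . randomCode(7, '0123456789abcdfghjkmnpqrstvwxyz') . "\n";
```

None of this makes the puzzle harder for a bot; it only tidies the code.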

    And if you want actual protection against spam bots, of course you shouldn't try to invent your own CAPTCHA. There are proven solutions like reCAPTCHA, which are better than any home-made riddle will ever be.

    Comments on this post

    • Strider64 agrees : I use reCaptcha and it is very easy to incorporate into the code.
    The 6 worst sins of security • How to (properly) access a MySQL database with PHP

    Why can't I use certain words like "drop" as part of my Security Question answers?
    There are certain words used by hackers to try to gain access to systems and manipulate data; therefore, the following words are restricted: "select," "delete," "update," "insert," "drop" and "null".
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2012
    Posts
    33
    Rep Power
    3
    Hi
    If you run the script then you'll see that I randomize the input boxes.
    I would like to know how a bot will get around understanding what to fill in where.

    It will definitely deter most of the bots filling in my forms.

    I was trying to search about loops and repeating actions,
    You are probably referring to the capabilities of the spam bots.
    Maybe you can link an article.

    Regarding safe.
    I said the function is inside the included session.php
    So safe() can be removed from my code.

    I know about reCAPTCHA, but I would like to create my own version.
  6. #4
  7. No Profile Picture
    Lost in code
    Devshed Supreme Being (6500+ posts)

    Join Date
    Dec 2004
    Posts
    8,317
    Rep Power
    7170
    As Jacques1 already mentioned, writing a bot specifically to attack that code would be almost trivial.

    There are several different levels of bot protection that you can use on a site. In some cases, writing your own bot protection code does actually make sense (but not really writing your own CAPTCHA). Bot protection is generally a balance of usability vs protection. The problem is that it's hard to use a computer to generate a question that is easy for a human to answer but hard for a computer to answer.

    You can generally classify spam bots into three categories:
    1. Bots programmed specifically to post on your site
    2. Bots programmed specifically to post on a certain class of site (ex: WordPress, phpBB, etc.)
    3. Bots programmed to try to post into any form

    If you run a website, you will eventually get hit by 3; it's pretty much inevitable. If you run a site that uses standard technology you'll eventually get hit by 2 also. However, unless you run a very large site, the chances of someone bothering to invest time in writing a bot specifically to attack your site are pretty low, so you probably won't get hit by 1.

    You can also generally classify bot protection methods into categories:
    (a) no protection against bots
    (b) simple protection - easy for both humans and bots to solve
    (c) complicated protection - difficult for both humans and bots to solve

    Ideally there would be a "difficult for bots to solve but easy for humans to solve" option, but in practice there isn't a universal solution that is easy for all humans and difficult for all bots.

    To protect against bots in categories 2 or 3, I recommend rolling a custom solution in the vein of (b). This should be sufficient for nearly all small sites. Rolling a custom solution for this only takes a few lines of code and is actually better than using a standard solution for it, because bots of type 2 might be programmed specifically to attack a standard solution.

    What you have almost falls into this category already, except that I would classify your solution as difficult for humans to solve and easy for bots to solve, because you're asking them to enter up to 7 separate values. You may as well just ask the human to enter a single value rather than 7. Only a bot of type 1 could solve this puzzle, and if someone is going to write a bot specifically to attack your site, it is literally just as easy for them to answer 7 random values as it is for them to answer 1 random value. For the same reason, you don't need a complicated random password generation method; just using rand(0,100) to generate the random number will have the same effectiveness.
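A rough sketch of the single-value idea, under a few assumptions: issueChallenge() and checkChallenge() are hypothetical names, the 'challenge' key is made up, and $session is a plain array standing in for $_SESSION so the snippet can run from the command line. The `??` operator needs PHP 7 or later.

```php
<?php
// Hypothetical helper: pick one number and remember it for this visitor.
function issueChallenge(array &$session): int
{
    $session['challenge'] = rand(0, 100);
    return $session['challenge'];
}

// Hypothetical helper: compare the posted answer and consume the challenge.
function checkChallenge(array &$session, $answer): bool
{
    $expected = $session['challenge'] ?? null;
    unset($session['challenge']); // one attempt per issued number
    return $expected !== null
        && is_string($answer)
        && trim($answer) === (string) $expected;
}

$session = []; // stand-in for $_SESSION
$number = issueChallenge($session);
echo "Please type the number $number into the box below.\n";

var_dump(checkChallenge($session, (string) $number)); // correct answer
var_dump(checkChallenge($session, (string) $number)); // replay after the challenge was used
```

This is a type (b) measure only: it stops bots that post blindly, not a bot written for the site.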

    If your site is being hit by more advanced bots, then you should implement a pre-existing solution of type (c), like reCAPTCHA. Rolling your own type (c) solution is rarely worthwhile, particularly one which is actually accessible to disabled users.

    Comments on this post

    • NotionCommotion agrees : Nice detail
    PHP FAQ

    Originally Posted by Spad
    Ah USB, the only rectangular connector where you have to make 3 attempts before you get it the right way around
  8. #5
  9. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,959
    Rep Power
    1014
    Originally Posted by jpmul
    Hi
    If you run the script then you'll see that I randomize the input boxes.
    I would like to know how a bot will get around understanding what to fill in where.
    The bot does the same thing a human would do: It walks through the second row, and whenever it encounters an input element, it fills in the value from the row above.

    As pseudo code:

    Code:
    values := parse(first_row)
    index := 0
    for table_field in parse(second_row):
    	child_element := child(table_field)
    	if type(child_element) = input:
    		value(child_element) := values[index]
    	index := index + 1
    Like I already said, bots are much better at solving simple tasks like this than humans will ever be. A bot can parse an HTML document and fill out a bunch of input elements in a few microseconds. A human needs time to understand the task, apprehend the visual structure, type in the characters etc.

    So this challenge is relatively difficult for humans (especially if they're disabled), but easy for machines.
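The pseudo code above could look roughly like this in PHP with DOMDocument and DOMXPath. It is a sketch under assumptions: solveTable() is a made-up name, the DOM extension must be installed, and the markup is a cut-down version of the two-row table from the first post rather than the real page.

```php
<?php
// Hypothetical solver: map each input name in the second row
// to the character shown in the same column of the first row.
function solveTable(string $html): array
{
    libxml_use_internal_errors(true); // real pages are rarely valid markup
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    $answers = [];
    $rows = $xpath->query('//table//tr');
    if ($rows->length < 2) {
        return $answers;
    }

    $first = $xpath->query('td', $rows->item(0));
    $second = $xpath->query('td', $rows->item(1));

    for ($i = 0; $i < $second->length; $i++) {
        $input = $xpath->query('.//input[@name]', $second->item($i))->item(0);
        $source = $first->item($i);
        if ($input !== null && $source !== null) {
            $answers[$input->getAttribute('name')] = trim($source->textContent);
        }
    }
    return $answers;
}

$html = '<table><tr><td>a</td><td>7</td><td>K</td></tr>'
      . '<tr><td><input type="text" name="HumanCode0"/></td><td>7</td>'
      . '<td><input type="text" name="HumanCode2"/></td></tr></table>';

print_r(solveTable($html));
```

The returned array is exactly what a bot would send back as POST data, together with the session cookie it received with the page.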



    Originally Posted by jpmul
    I was trying to search about loops and repeating actions,
    You are probably referring to the capabilities of the spam bots.
    No, I'm talking about programming basics:

    http://www.php.net/manual/en/languag...structures.php

    PHP has four types of loops: "while", "do-while" (rarely used), "for" and "foreach".
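A quick illustration of the four loop types, using a made-up three-element list of the field names from the first post. Each loop prints the same names.

```php
<?php
$fields = ['HumanCode0', 'HumanCode1', 'HumanCode2'];

// for: a counter with a known range
for ($i = 0; $i < count($fields); $i++) {
    echo "for: {$fields[$i]}\n";
}

// foreach: walk an array without managing the index yourself
foreach ($fields as $index => $name) {
    echo "foreach: $index => $name\n";
}

// while: the condition is checked before each pass
$i = 0;
while ($i < count($fields)) {
    echo "while: {$fields[$i]}\n";
    $i++;
}

// do-while: the body runs at least once, the condition is checked afterwards
$i = 0;
do {
    echo "do-while: {$fields[$i]}\n";
    $i++;
} while ($i < count($fields));
```

For the seven HumanCode fields, foreach over a list of names (or a for loop from 0 to 6) replaces the copied lines.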



    Originally Posted by jpmul
    Regarding safe.
    I said the function is inside the included session.php
    So safe() can be removed from my code.
    You shouldn't remove it but rather replace it with the correct function.



    Originally Posted by jpmul
    I know about reCAPTCHA, but I would like to create my own version.
    Sure. This is great for learning, but it won't block anybody but human users.
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2012
    Posts
    33
    Rep Power
    3
    Hi Jacques1 and E-Oreo
    Thank you for your detailed answers.
    I really appreciate it. And the information is very helpful.
    Sorry I didn't actually mean writing my own CAPTCHA

    Seeing that a bot is so clever at filling in forms so quickly,
    a sure way would be to measure how long it takes from page request to form submission by the user/bot.

    If it takes milliseconds, or anything under 5 seconds, then reject the submission.
    Otherwise accept the form if it took over 5 seconds.
    Any thoughts on this, please?
  12. #7
  13. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,959
    Rep Power
    1014
    Originally Posted by jpmul
    If it takes milliseconds, or anything under 5 seconds, then reject the submission.
    Otherwise accept the form if it took over 5 seconds.
    Any thoughts on this, please?
    In my opinion, this is way too much effort for a bit of spam protection. You'd have to measure the exact time, store it in the database or a session, figure out a sensible time limit etc. And even then it won't be reliable, because a bot with a slow network connection won't be distinguishable from a human.
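For reference, the bookkeeping described above would look roughly like this. It is only a sketch: markFormRendered() and submittedTooFast() are hypothetical names, $session stands in for $_SESSION, and the timestamps are passed in (instead of calling time()) so the snippet is repeatable. The `void` return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: remember when the form was sent to the visitor.
function markFormRendered(array &$session, int $now): void
{
    $session['form_rendered_at'] = $now;
}

// Hypothetical helper: true if the form came back sooner than $minSeconds.
function submittedTooFast(array $session, int $now, int $minSeconds = 5): bool
{
    if (!isset($session['form_rendered_at'])) {
        return true; // the form was never requested in this session
    }
    return ($now - $session['form_rendered_at']) < $minSeconds;
}

$session = []; // stand-in for $_SESSION
markFormRendered($session, 1000);

var_dump(submittedTooFast($session, 1002)); // 2 seconds after rendering
var_dump(submittedTooFast($session, 1006)); // 6 seconds after rendering
```

It does not answer the objection: a bot that is slow, or that simply waits, passes the same check.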

    Don't try to outsmart the spammers. Use hidden fields and possibly reCAPTCHA, and that's it. I think trying to come up with your own perfect spam protection is just a waste of time.
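One common reading of "hidden fields" is a honeypot: an extra input that humans never see and therefore leave empty. A minimal sketch, assuming that reading; the field name 'website' and the helper names are made up for illustration.

```php
<?php
// Hypothetical field name; something a generic form-filling bot is likely to fill in.
const HONEYPOT_FIELD = 'website';

// Markup to place inside the form. The wrapper is hidden with CSS,
// so a human visitor never sees or fills in the field.
function honeypotMarkup(): string
{
    return '<div style="display:none">'
         . '<label>Leave this field empty '
         . '<input type="text" name="' . HONEYPOT_FIELD . '" value="" autocomplete="off" tabindex="-1">'
         . '</label></div>';
}

// Any non-empty value in the hidden field suggests an automated submission.
function looksLikeBot(array $post): bool
{
    return isset($post[HONEYPOT_FIELD]) && $post[HONEYPOT_FIELD] !== '';
}

echo honeypotMarkup(), "\n";
var_dump(looksLikeBot(['website' => 'http://example.com/'])); // field was filled in
var_dump(looksLikeBot(['website' => '']));                    // field left empty
```

This only catches bots that fill in every field they find, which is the large, untargeted group.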
  14. #8
  15. No Profile Picture
    Lost in code
    Devshed Supreme Being (6500+ posts)

    Join Date
    Dec 2004
    Posts
    8,317
    Rep Power
    7170
    Seeing that a bot is so clever at filling in forms so quickly,
    a sure way would be to measure how long it takes from page request to form submission by the user/bot.

    If it takes milliseconds, or anything under 5 seconds, then reject the submission.
    Otherwise accept the form if it took over 5 seconds.
    Any thoughts on this, please?
    The bot programmer can just add an artificial delay of 5 seconds.

    Comments on this post

    • Jacques1 agrees
  16. #9
  17. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2012
    Posts
    33
    Rep Power
    3
    Hi
    Thanks, I've taken your advice and am discarding my method.
    Hidden input fields and checking the time between page open and submit seem to be good enough for me at this stage.

    Maybe I can create a few basic images of my own and ask the user to type what they see in the picture, with a hint which would direct them what to type in the box. Examples like "Glass", "Tree", "House", "Fork", "Chair", to mention just a few.
  18. #10
  19. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,959
    Rep Power
    1014
    Originally Posted by jpmul
    Hi
    Thanks, I've taken your advice and am discarding my method.
    Hidden input fields and checking the time between page open and submit seem to be good enough for me at this stage.
    Um, didn't you just say you discarded this idea?

    I mean, you're free to do whatever you want. If you insist on making your own CAPTCHA, go ahead. But if you want our advice (which I thought you're here for), then stop trying to reinvent the wheel and use a simple, proven solution -- like reCAPTCHA or any equivalent service. Installing it takes only a few minutes, and it actually works. No need to come up with your own distorted pictures and puzzles and whatnot.
  20. #11
  21. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2012
    Posts
    33
    Rep Power
    3
    Ok thanks noted
  22. #12
  23. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Posts
    12
    Rep Power
    0
    1. You underestimate what modern bots can do.
    2. If you write your own method, you will filter out 99.99% of them, but if anyone tries to write a bot specifically for your site, your defence will fall.

    That means you can use your method for educational purposes or on small projects, but:

    1. Don't use it on commercial projects.
    2. Don't use it on many projects (unless you want to feel like a firefighter).

    Comments on this post

    • NotionCommotion agrees : use your method for educational purposes
  24. #13
  25. No Profile Picture
    Contributing User
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Dec 2004
    Posts
    2,996
    Rep Power
    375
    Originally Posted by E-Oreo
    The bot programmer can just add an artificial delay of 5 seconds.
    How would the bot know to add an artificial delay? Do the owners/creators of bots log each action of the bot? Do bots report back if they couldn't successfully post the form?


    Are bots capable of "understanding" all of your approaches? For example:

    1. asking a question: What is our planet called?
    2. asking a maths question: 2+4
    3. asking the user to visit a page on the site and then asking a question about that page
    4. detecting whether the submit button was clicked too fast

    Now, sure, the bot might be able to "crack" all of these, but how would it know what to do when it first sees your form?
  26. #14
  27. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Posts
    12
    Rep Power
    0
    Are bots capable of "understanding" all of your approaches? for example:
    Again, if anyone tries to write a bot specifically for your site, your defence will fall.

    But if you use reCAPTCHA, for example, even a very motivated hacker will face big problems.
  28. #15
  29. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2012
    Posts
    194
    Rep Power
    77
    Originally Posted by paulh1983
    How would the bot know to add an artificial delay? Do the owners/creators of bots log each action of the bot? Do bots report back if they couldn't successfully post the form?


    Are bots capable of "understanding" all of your approaches? For example:

    1. asking a question: What is our planet called?
    2. asking a maths question: 2+4
    3. asking the user to visit a page on the site and then asking a question about that page
    4. detecting whether the submit button was clicked too fast

    Now, sure, the bot might be able to "crack" all of these, but how would it know what to do when it first sees your form?
    So let's say I send my bot to your website and it encounters your CAPTCHA. The first step I would have it take is to read the CAPTCHA's question and, if it's math, do the math and submit. If that fails, the second step is to search for any hidden fields, read the data, and paste it into the CAPTCHA based on the best match for the question. The third is to detect whether the CAPTCHA has a timed delay and, if so, delay the bot by 6 seconds. The fourth is to have the bot google the question and submit the best result to the CAPTCHA.

    All this can be done in milliseconds, depending on Internet speed and the response speed of the website.

    Of course these are just a few simple methods of beating the CAPTCHA, and my own real bot, which I made for fun, is much better than this.
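The first step described above (read the question, and if it's math, do the math) could be as short as the following. This is an illustrative sketch, not the poster's bot: solveArithmetic() is a made-up name, and it only handles two whole numbers joined by +, - or *. The nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical solver for questions such as "What is 2+4?".
// Returns null when no simple arithmetic expression is found.
function solveArithmetic(string $question): ?int
{
    if (!preg_match('/(\d+)\s*([+\-*])\s*(\d+)/', $question, $m)) {
        return null;
    }
    $a = (int) $m[1];
    $b = (int) $m[3];
    switch ($m[2]) {
        case '+':
            return $a + $b;
        case '-':
            return $a - $b;
        default: // '*'
            return $a * $b;
    }
}

var_dump(solveArithmetic('asking a maths question: 2+4'));
var_dump(solveArithmetic('What is our planet called?'));
```

When the function returns null, the bot would move on to the next strategy in the list.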