Thread: SHA and MD5

    #1
  1. Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Dec 2004
    Location
    Meriden, Connecticut
    Posts
    1,797
    Rep Power
    154

    SHA and MD5


    I'm just wondering what the purpose of using SHA or MD5 is. If I want to send encrypted data from a client (made by me) to a server (also made by me), how can the server possibly read the data it receives? It seems there is no way to decrypt the data. Am I doing something wrong, or do I misunderstand the concept?
  2. #2
  3. Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Nov 2003
    Posts
    624
    Rep Power
    34
    SHA and MD5 aren't encrypting the data, they are hashing it.

    As Python has a builtin hash function, I will quote the help for clarity:

    Code:
    >>> help(hash)
    Help on built-in function hash in module __builtin__:
    
    hash(...)
        hash(object) -> integer
        
        Return a hash value for the object.  Two objects with the same value have the same hash value.  The reverse is not necessarily true, but likely.
    
    >>>
    The point being that you run some data through a hash, and get a unique*, but repeatable, small value out the other side. No matter how big the original data (10 lines, 100 MB, 23 GB), the MD5 hash will always be the same size: 128 bits, shown as 32 hex characters.

    That way, if you hash the data again, you get the same answer out.

    Code:
    >>> import hashlib  # the standalone md5 module no longer exists in Python 3
    >>> hashlib.md5(b'abcdefg').hexdigest()
    '7ac66c0f148de9519b8bd264312c4d64'
    
    >>> hashlib.md5(b'abcDefg').hexdigest()
    '12fa4d1e58529e56adce5e4d675f2b89'
    
    >>> hashlib.md5(b'abcdefg').hexdigest()
    '7ac66c0f148de9519b8bd264312c4d64'
    >>>
    A slight change, and the hash is completely different.
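
    To illustrate the fixed-size point, here is a minimal sketch using Python 3's hashlib. The helper name md5_hex is just made up for this example.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()

short_digest = md5_hex(b"abc")
long_digest = md5_hex(b"x" * 10_000_000)  # roughly 10 MB of input

# Both digests are 32 hex characters (128 bits), whatever the input size.
print(len(short_digest), len(long_digest))

# Hashing the same input again gives the same digest.
print(md5_hex(b"abc") == short_digest)
```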

    Say someone hosts a 500Mb file on the web, and also puts the MD5 hash of the file there with it.

    I can download the file, then run it through MD5 on my own computer. If the hash comes out the same - the data is identical.

    If the hash is different, the data is different - maybe corrupted by the download, maybe tampered with by a hacker, maybe an older version, maybe I have a disk error, etc.
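
    The download check could be sketched roughly like this. The function names and the chunk size are my own assumptions, and the file is read in pieces so a 500Mb download doesn't have to fit in memory.

```python
import hashlib

def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file chunk by chunk and return the MD5 digest as hex."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the local file's hash with the one published next to the download."""
    return file_md5(path) == published_hex.strip().lower()
```

    If matches_published_hash returns False, all you know is that the bytes differ. It can't tell you whether that was corruption, tampering or just an older version.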

    You can use the hash to say that two items are the same, or have not changed. But you can't get the data out of the hash.

    I believe Python dictionaries and other internals use it to track if two things are the same, so they can compare complex things very quickly. E.g. instead of testing 1000 characters in two strings to see if they are the same, it does two hashes (very quickly) and checks if those are the same.
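
    A small sketch of that idea, as far as I understand it: equal values have equal hashes, and a dict uses the hash to find a candidate slot and then still confirms with == (so the hash narrows the search rather than replacing the comparison). The variable names are only for illustration.

```python
a = "x" * 1000
b = "x" * 999 + "x"   # same value as a, built separately
c = "x" * 999 + "y"   # differs in the last character

# Equal values always have equal hashes (within one run of the interpreter).
print(hash(a) == hash(b))

# The dict locates the key by its hash, then confirms the match with ==.
lookup = {a: "found"}
print(lookup[b])
print(c in lookup)
```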


    *The values aren't guaranteed to be unique, just statistically very, very likely to be. If you put two different bits of data in and get the same hash out both times, that's known as a hash collision, and is a bad thing.

    In fact, looking around to see how likely this is, I found this - it's so unlikely that it's worthy of writing a paper about when someone manages to do it.

    That paper has this to say:

    Hash functions are one of the basic building blocks of modern cryptography. In cryptography hash functions are used for everything from password verification to digital signatures. A hash function has three fundamental properties:

    A hash function must be able to easily convert digital information (i.e. a message) into a fixed length hash value.
    It must be computationally infeasible to derive any information about the input message from just the hash.
    It must be computationally infeasible to find two files to have the same hash. Hash (Message 1) = Hash (Message 2).

    In computer forensics hash functions are important because they provide a means of identifying and classifying electronic evidence. Because hash functions play a critical role in evidence authentication it is critical a judge or jury can trust the hash values that uniquely identify electronic evidence.

    The third property of a hash function states that it must be computationally infeasible to find two files to have the same hash. The research published by Wang, Feng, Lai and Yu demonstrated that MD5 fails this third requirement since two different messages have been generated that have the same hash. This situation is called a collision.
    So, if you can find collisions, or get the data out again, it's not officially a working cryptographic hash algorithm anymore.
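
    To show what a collision looks like without needing the real MD5 collision data, here is a toy sketch. It deliberately weakens the hash to the first 4 hex digits of MD5 (my own assumption, purely for illustration), so that two different inputs sharing a hash can be found in a moment. This is not the attack from the paper, only a picture of the definition.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes) -> str:
    """Toy hash: only the first 4 hex digits (16 bits) of MD5. Not for real use."""
    return hashlib.md5(data).hexdigest()[:4]

def find_collision():
    """Try b'0', b'1', b'2', ... until two different inputs share a tiny_hash."""
    seen = {}  # tiny_hash value -> first message that produced it
    for i in count():
        message = str(i).encode()
        h = tiny_hash(message)
        if h in seen:
            return seen[h], message, h
        seen[h] = message

first, second, shared = find_collision()
print(first, second, shared)
```

    With only 65,536 possible values the search has to succeed within 65,537 tries. The full 128-bit MD5 has no such shortcut by brute force, which is why a practical collision was worth a paper.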

    If you want a really interesting read on the basics of encryption and how it has got more powerful through history, I recommend "The Code Book" by Simon Singh. (It even has 5 stars at Amazon).
    Last edited by sfb; September 10th, 2005 at 02:33 PM.
  4. #3
  5. Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Dec 2004
    Location
    Meriden, Connecticut
    Posts
    1,797
    Rep Power
    154
    Ah. I see. So for sending data such as passwords it would be better to simply encrypt it? Once again thanks for the help sfb.
  6. #4
  7. Banned ;)
    Devshed Supreme Being (6500+ posts)

    Join Date
    Nov 2001
    Location
    Woodland Hills, Los Angeles County, California, USA
    Posts
    9,638
    Rep Power
    4247
    Actually, for passwords, the best thing to do is not to send the password at all. Here's how it works.
    1. When you enter the password for the first time, the computer generates a hash from your password and only stores the hash.
    2. When you enter a password to log in, the computer hashes whatever you typed in and compares it with the hash that it already has stored. If they match, it lets you in. If not, it prompts you to try again.

    The nice thing about this approach is that no one can tell what your password is by examining the password file (not even the system administrator) or even by sniffing the packets, because the password is hashed before being sent.
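
    The two steps above could be sketched like this. Note the assumptions: the thread only talks about plain hashing, whereas this sketch swaps in a per-user random salt and PBKDF2 from hashlib, and the function names and iteration count are made up for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, tune for your hardware

def make_record(password: str) -> tuple:
    """Step 1: return (salt, hash) to store. The password itself is never stored."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def check_password(attempt: str, salt: bytes, stored: bytes) -> bool:
    """Step 2: hash the attempt the same way and compare it with the stored hash."""
    digest = hashlib.pbkdf2_hmac("sha256", attempt.encode(), salt, ITERATIONS)
    # compare_digest compares in constant time to avoid leaking timing information.
    return hmac.compare_digest(digest, stored)
```

    The salt means two users with the same password get different stored hashes, and the iteration count slows down brute-force guessing against a stolen password file.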
  8. #5
  9. (retired)
    Devshed Supreme Being (6500+ posts)

    Join Date
    Dec 2003
    Location
    The Laboratory
    Posts
    10,101
    Rep Power
    0
    Have a look at this thread and this thread where we talk about hashing vs. encrypting.

    --Simon
  10. #6
  11. Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Dec 2004
    Location
    Meriden, Connecticut
    Posts
    1,797
    Rep Power
    154
    Originally Posted by Scorpions4ever
    Actually, for passwords, the best thing to do is not to send the password at all. Here's how it works.
    1. When you enter the password for the first time, the computer generates a hash from your password and only stores the hash.
    2. When you enter a password to log in, the computer hashes whatever you typed in and compares it with the hash that it already has stored. If they match, it lets you in. If not, it prompts you to try again.

    The nice thing about this approach is that no one can tell what your password is by examining the password file (not even the system administrator) or even by sniffing the packets, because the password is hashed before being sent.
    That's what I was thinking about. However, I still wanted to have access to all accounts' passwords. Not that I need them for personal use, but in case a user were to forget his/her password, it could be retrieved. I guess I'll just do what you suggested. There are other methods of resetting an account's password. Thanks.
  12. #7
  13. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2004
    Posts
    461
    Rep Power
    25
    I would highly recommend hashing passwords rather than storing them in plain text or under some common encryption shared between the passwords. The reason is that if someone manages to get hold of all the users' account information, they have their passwords as well. A hash of someone's password is pretty close to useless except for a brute-force attack, and the user guards against that by choosing a good password.

    For lost passwords, I recommend some sort of protocol where they ask for a new one and you send it by email, possibly after asking a security question or something. Don't activate the new password until something happens, like the user emailing you back or going to a web site and saying "hey, I want to activate it". This will keep annoying 12 year olds from making life hard for you.

    It's just always better when only the account holder knows the password, and hash functions make it possible for only them to know it, not even the administrator.

  14. #8
  15. Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Dec 2004
    Location
    Meriden, Connecticut
    Posts
    1,797
    Rep Power
    154
    That is what I will most likely end up doing. However, I may use a method of password resetting other than having them answer a private question. If I can think of one, that is.
  16. #9
  17. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2005
    Posts
    174
    Rep Power
    11
    Yes, some sites reset a password by sending an email link. The user clicks the link and goes to a secure site where they (possibly after entering their username or email) enter a new password and confirm it, and that resets the hash.

    This also avoids putting the password in a plain text email, since no one encrypts email. On your end you would only need to keep the user's username and email address on file with the hash.
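
    A rough sketch of the emailed-link reset, under some assumptions of my own: the token is generated with the secrets module, only a hash of the token is kept on the server, links expire after an hour, and the in-memory dict pending_resets stands in for whatever storage you really use. All of the names are hypothetical.

```python
import hashlib
import secrets
import time

RESET_TTL_SECONDS = 3600   # assumed: reset links expire after one hour
pending_resets = {}        # hypothetical store: token hash -> (username, expiry time)

def start_reset(username: str) -> str:
    """Create a one-time token. The caller emails the user a link containing it."""
    token = secrets.token_urlsafe(32)
    token_hash = hashlib.sha256(token.encode()).hexdigest()
    pending_resets[token_hash] = (username, time.time() + RESET_TTL_SECONDS)
    return token

def finish_reset(token: str):
    """Return the username if the token is valid and unexpired, otherwise None."""
    token_hash = hashlib.sha256(token.encode()).hexdigest()
    entry = pending_resets.pop(token_hash, None)  # pop makes the token single-use
    if entry is None:
        return None
    username, expiry = entry
    return username if time.time() <= expiry else None
```

    Nothing about the account changes until the user actually follows the link. Once finish_reset returns a username, the site would ask for the new password twice and store its hash in place of the old one.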

    cheers
    sf2k
