September 10th, 2005, 12:58 PM
SHA and MD5
I'm just wondering what the purpose of using SHA or MD5 is. If I want to send encrypted data from a client (made by me) to a server (also made by me), how can the server possibly read the data it receives? It seems there is no way to decrypt the data. Am I doing something wrong, or do I misunderstand the concept?
September 10th, 2005, 02:27 PM
SHA and MD5 aren't encrypting the data, they are hashing it.
As Python has a builtin hash function, I will quote the help for clarity:
Help on built-in function hash in module __builtin__:
hash(object) -> integer
Return a hash value for the object. Two objects with the same value have the same hash value. The reverse is not necessarily true, but likely.
The point being that you run some data through a hash, and get a unique*, but repeatable, small value out the other side. No matter how big the original data, 10 lines, 100 MB, 23 GB, the MD5 hash will always be the same size.
That way, if you hash the data again, you get the same answer out.
A slight change, and the hash is completely different.
>>> import md5
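The old md5 module was later replaced by hashlib, so here is a minimal sketch of the same idea, assuming Python 3. The helper name md5_hex and the sample strings are made up for illustration. It shows the points above: hashing the same data twice gives the same answer, a one-letter change gives a different hash, and the hash length does not depend on the input size.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()

first = md5_hex(b"The quick brown fox")
again = md5_hex(b"The quick brown fox")
changed = md5_hex(b"The quick brown fax")   # one letter differs
large = md5_hex(b"x" * 1_000_000)           # a much larger input

print(first == again)          # same input, same hash
print(first == changed)        # slightly different input, different hash
print(len(first), len(large))  # both digests are 32 hex characters
```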
Say someone hosts a 500 MB file on the web, and also puts the MD5 hash of the file there with it.
I can download the file, then run it through MD5 on my own computer. If the hash comes out the same - the data is identical.
If the hash is different, the data is different - maybe corrupted by the download, maybe tampered with by a hacker, maybe an older version, maybe I have a disk error, etc.
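Checking a download like that could look roughly like this. It is only a sketch: the function names file_md5 and matches_published_hash, and the chunk size, are my own choices and not from any particular tool. The file is read in chunks so a 500 MB download never has to fit in memory at once.

```python
import hashlib

def file_md5(path, chunk_size=1024 * 1024):
    """Hash a file chunk by chunk and return the MD5 digest as hex."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_hash(path, published_hex):
    """True if the local file hashes to the value posted with the download."""
    return file_md5(path) == published_hex.strip().lower()
```

If matches_published_hash returns False, the hash alone cannot tell you which of the causes above applies, only that the bytes differ.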
You can use the hash to say that two items are the same, or have not changed. But you can't get the data out of the hash.
I believe Python dictionaries and other internals use hashes to find and compare keys very quickly. E.g. instead of testing 1000 characters in two strings to see if they are the same, it can compare the two hashes first (very quickly) and only do the full comparison when those match.
*The values aren't guaranteed to be unique, just statistically very, very likely. But if you put two different bits of data in and get the same hash out both times, that's known as a hash collision, and is a bad thing.
In fact, looking around to see how likely this is, I found this - it's so unlikely that it's worthy of writing a paper about when someone manages to do it.
That paper has this to say:
So, if you can get the data out again, it's not officially a working hash algorithm anymore.
If you want a really interesting read on the basics of encryption and how it has got more powerful through history, I recommend "The Code Book" by Simon Singh. (It even has 5 stars at Amazon).
Last edited by sfb; September 10th, 2005 at 02:33 PM.
September 10th, 2005, 02:58 PM
Ah. I see. So for sending data such as passwords it would be better to simply encrypt it? Once again thanks for the help sfb.
September 10th, 2005, 04:38 PM
Actually, for passwords, the best thing to do is not to send the password at all. Here's how it works.
1. When you enter the password for the first time, the computer generates a hash from your password and only stores the hash.
2. When you enter a password to log in, the computer hashes whatever you typed in and compares it with the hash that it already has stored. If they match, it lets you in. If not, it prompts you to try again.
The nice thing about this approach is that no one can tell what your password is by examining the password file (not even the system administrator), or even by sniffing the packets, because the password is hashed before being sent.
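The two steps above can be sketched as follows. Note the assumptions: rather than a bare MD5 or SHA hash, this uses a random salt and PBKDF2 from hashlib, which is a swapped-in technique not mentioned in the thread, and the function names and iteration count are just placeholders.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, not a recommendation from the thread

def make_password_record(password):
    """Step 1: derive a hash from the new password; store only (salt, hash)."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def check_password(attempt, salt, stored):
    """Step 2: hash the login attempt the same way and compare the results."""
    digest = hashlib.pbkdf2_hmac("sha256", attempt.encode("utf-8"), salt, ITERATIONS)
    # compare_digest avoids leaking how many leading bytes matched
    return hmac.compare_digest(digest, stored)
```

The salt is there so that two users with the same password do not end up with the same stored hash.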
September 10th, 2005, 05:16 PM
September 10th, 2005, 05:49 PM
That's what I was thinking about. However, I wanted to still have access to all accounts' passwords. Not that I need them for personal use, but in case a user were to forget his/her password, it could be retrieved. I guess I'll just do what you suggested. There are other ways to reset an account's password. Thanks.
Originally Posted by Scorpions4ever
September 11th, 2005, 12:15 AM
I would highly recommend hashing passwords rather than storing them in plain text or under some common encryption shared between the passwords. The reason is that if someone gets hold of all the account users' information, they then have their passwords as well. A hash of someone's password is pretty close to useless except for a brute-force attack, and users can guard against that by choosing good passwords.
For lost passwords, I recommend some sort of protocol where they ask for a new one and you send it by email, possibly after asking a security question or something. Don't activate the new password until something happens, like the user emails you back or goes to a web site and says hey, I want to activate it. This will keep annoying 12-year-olds from making life hard for you.
It's just always better when only the account holder knows the password, and hash functions make it possible for only them to know it, not even the administrator.
September 11th, 2005, 12:06 PM
That is what I will most likely end up doing. However, I may use another method of password resetting other than having them answer a private question. If I can think of one, that is.
September 14th, 2005, 12:56 AM
Yes, some sites reset a password by sending an email link. The user clicks on the link and goes to a secure site where they (possibly after entering their username or email) enter a new password and confirm it, and that resets the hash.
This also avoids putting the password in a plain-text email, since no one encrypts email. On your end you would only need to keep the user's username and email address on file with the hash.
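A rough sketch of the token behind such a reset link, under my own assumptions: the names issue_reset_token, token_is_valid and RESET_TTL_SECONDS are hypothetical, the one-hour lifetime is arbitrary, and sending the email and storing the record are left out. Only a hash of the token is kept on the server, in the same spirit as the password itself.

```python
import hashlib
import secrets
import time

RESET_TTL_SECONDS = 3600  # assumed lifetime of a reset link

def issue_reset_token(now=None):
    """Return (token to put in the emailed link, record to keep server side)."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    record = {
        "token_hash": hashlib.sha256(token.encode("utf-8")).hexdigest(),
        "expires_at": now + RESET_TTL_SECONDS,
    }
    return token, record

def token_is_valid(token, record, now=None):
    """Check the token from the clicked link against the stored record."""
    now = time.time() if now is None else now
    candidate = hashlib.sha256(token.encode("utf-8")).hexdigest()
    not_expired = now <= record["expires_at"]
    return not_expired and secrets.compare_digest(candidate, record["token_hash"])
```

Once token_is_valid passes and the user has confirmed a new password, the stored password hash would be replaced and the token record deleted so the link cannot be reused.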