Security and Cryptography
 
#1
March 21st, 2008, 11:23 PM
twoblink
Crypto Algorithm Question - Using SHA1 for file-uniqueness check still ok?

I have a website with pictures and I'd like to SHA1 the pic files and use that as the filename for that picture.

Now, the goal is just for unique identification, and while SHA1 seems broken academically, I'm not trying to use it for signatures.. I just want a reasonable guarantee that no two pictures in my photo album will SHA1 to the same value.

Even though a 160-bit hash only gives about 80 bits of collision resistance, 2^80 is still a large number of photos, and I am not sure I can imagine two photos that hash to the same value while looking different.

I would pick something bigger like SHA512, but then the filename would get too long to manage.

Now I am thinking I'd pick SHA1 over RIPEMD-160 simply because I have more guarantees of compatibility as well as correctness of implementation.

In fact, I'd be happy with 128 bits, as that'd yield a filename that's only 16 bytes long instead of 20 (32 hex characters instead of 40)..

Just wanted to get some opinions..
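
A minimal sketch of the idea in this post, assuming Python and its standard `hashlib` module. The function name `content_name` and its parameters are made up for illustration. It hashes the file in chunks and returns the hex digest to use as the filename; the optional truncation corresponds to the "128 bits would do" remark.

```python
import hashlib

def content_name(path, algo="sha1", digest_bytes=None, chunk_size=1 << 20):
    """Return a hex string derived only from the bytes of the file at `path`.

    algo         -- any name accepted by hashlib.new (e.g. "sha1", "md5")
    digest_bytes -- if set, keep only this many leading bytes of the digest
    """
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        # Read in chunks so large photos are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    digest = h.digest()
    if digest_bytes is not None:
        digest = digest[:digest_bytes]
    return digest.hex()
```

With the defaults the result is 40 hex characters (20 bytes). `content_name(path, digest_bytes=16)` gives 32 hex characters, at the cost of fewer bits of collision resistance.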

#2
March 22nd, 2008, 10:04 AM
fishtoprecords
For what you describe, SHA1 or even MD5 is fine.

Consider what happens if you get a collision. You disallow a photo or two? How likely is that? Very low.

#3
March 22nd, 2008, 10:15 AM
execute
For the sole purpose of unique file names, you don't need a hash function, just something that generates unique IDs. If you have PHP as a server-side language, you may use the uniqid function, whose output is based on the current time in microseconds. I'm pretty confident there are similar functions in all other server-side languages.
__________________
Nikola Ivanov
http://weboholic.de

#4
March 22nd, 2008, 12:40 PM
twoblink
Well..

Actually, I don't want to use something like uniqid, because I want the ID to be regenerable from nothing but the file.

Thus, the hash of the file.

If the database is trashed, then as long as I have the files, I can regenerate the indexes..

Unique filenames aren't the goal, primary keys in the database that correspond to the files themselves are.

Right now, most of the galleries out there have a disconnect between the records in the database and the things those records describe (the files).

I'm trying to cheat to bridge this gap.
Comments on this post
execute agrees!
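
A rough sketch of the "regenerate the index from nothing but the files" idea, assuming Python and a directory of photos on disk. The names `file_sha1`, `rebuild_index`, and `misnamed` are hypothetical, and a plain dict stands in for the database table.

```python
import hashlib
from pathlib import Path

def file_sha1(path):
    """SHA-1 hex digest of a file's contents, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def rebuild_index(photo_dir):
    """Rebuild a {content hash: path} mapping from the files alone."""
    index = {}
    for path in sorted(Path(photo_dir).rglob("*")):
        if path.is_file():
            # Keep the first path seen for each distinct content hash.
            index.setdefault(file_sha1(path), path)
    return index

def misnamed(photo_dir):
    """List files whose name (without extension) is not their own hash."""
    return [p for p in sorted(Path(photo_dir).rglob("*"))
            if p.is_file() and p.stem != file_sha1(p)]
```

Because the key is a pure function of the file bytes, `rebuild_index` needs no surviving database rows, and `misnamed` doubles as a cheap integrity check for files that were renamed or corrupted.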

#5
March 22nd, 2008, 12:45 PM
fishtoprecords
I think you are on the right track. You'll handle 'identical' files well.

There are interesting philosophical questions about 'near identical' files. A crypto hash will yield radically different values for tiny changes.

A little cropping changes the hash, but not the picture that human eyeballs interpret. A little softening of the image, or pushing the color, and it's totally different. Except it's not.
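
To illustrate the "tiny change, radically different hash" point, here is a small sketch in Python. The byte string is only a stand-in for image data and `differing_bits` is a hypothetical helper. It flips a single bit of the input and counts how many of the 160 output bits change.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = bytes(range(256)) * 4      # stand-in for the bytes of a photo
tweaked = bytearray(original)
tweaked[100] ^= 0x01                  # flip one bit, i.e. a barely changed file

d1 = hashlib.sha1(original).digest()
d2 = hashlib.sha1(bytes(tweaked)).digest()

print(d1.hex())
print(d2.hex())
print(differing_bits(d1, d2), "of", len(d1) * 8, "output bits differ")
```

For a well-behaved hash the count is typically somewhere around half of the output bits, so the two digests say nothing about how similar the inputs were.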

#6
March 22nd, 2008, 07:12 PM
twoblink
Absolutely..

A hash will give me a boolean: either the file is EXACTLY the same, or it's not. Off by one pixel is off by one byte or more, which will give me a radically different hash.

This will prevent file redundancy: the db holds pointers that all point to the same file, so I only need to keep one copy instead of a dozen of the exact same photo.
Comments on this post
fishtoprecords agrees!
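
One way the "keep one copy, many pointers" scheme could look, as a hedged Python sketch. `store_photo`, the `.jpg` default extension, and the flat store directory are all assumptions for illustration. The byte comparison covers the "what happens if you get a collision" question raised earlier in the thread.

```python
import hashlib
from pathlib import Path

def store_photo(data: bytes, store_dir, ext=".jpg"):
    """Store `data` under its SHA-1 name; return (key, was_new).

    If a file with that name already exists and holds the same bytes,
    nothing is written and the existing copy is reused.
    """
    key = hashlib.sha1(data).hexdigest()
    target = Path(store_dir) / (key + ext)
    if target.exists():
        # Same name but different bytes would be a genuine hash collision.
        if target.read_bytes() != data:
            raise ValueError("hash collision for key " + key)
        return key, False
    target.write_bytes(data)
    return key, True
```

Album rows in the database would then hold only the returned key, so a dozen uploads of the exact same photo end up as a dozen rows pointing at one file.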

#7
March 23rd, 2008, 06:25 AM
execute
Yes, I agree. Having a unique name depending on file content is certainly nice and bridges the gap to the DB.

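Using the birthday paradox, you'd need about 2.2 * 10^19 pictures to have a 50% chance of a collision with MD5, since its 128-bit output puts the birthday bound near 2^64. (A figure around 5.06 * 10^9 is what the same formula gives for a 64-bit hash.)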
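
The birthday estimate can be checked with the usual approximation n ≈ sqrt(2 · 2^bits · ln(1/(1 − p))), which assumes hash outputs behave like uniform random values. A small Python sketch follows; the function name `pictures_for_collision` is made up here.

```python
import math

def pictures_for_collision(bits, p=0.5):
    """Approximate number of random `bits`-bit hashes needed for a
    collision to occur with probability p (birthday approximation)."""
    return math.sqrt(2.0 * math.log(1.0 / (1.0 - p))) * 2.0 ** (bits / 2)

for name, bits in [("64-bit hash", 64), ("MD5", 128), ("SHA-1", 160)]:
    print(f"{name:12s} {pictures_for_collision(bits):.3e}")
```

This gives roughly 5.06e9 for a 64-bit hash, 2.2e19 for MD5's 128 bits, and 1.4e24 for SHA-1's 160 bits. The estimate only covers accidental collisions between unrelated files, not deliberately constructed ones.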

#8
March 24th, 2008, 09:10 PM
B-Con
This concept has been used before. phpBB (at least version 2, not sure about 3) uses MD5 hashes of users' avatars as the images' filenames.
__________________
- "Cryptographically secure linear feedback shift register based stream ciphers" -- a phrase that'll get any party started.
- Why know the ordinary when you can understand the extraordinary?


© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 2 hosted by Hostway