I'd like to discuss a key stretching algorithm for symmetric encryption for secure file exchange with pre-shared keywords.
My idea is to use an intermediate key and two different salt where the final key hash will be computed in two separate hashing sessions but additionally inside each hashing session the salt will be altered also by a special function, in the example Python listing below just two simple byte swap of hash values followed by a bit-wise exclusive or calculation in order to generate a "rolling" salt.
Secondly I introduce an additional verification hash (V_h) so that we would have an early notice if the ciphertext was tampered somehow or the keyword was wrong if V_h and V_h' don't match and we can stop any further decryption at this point.
Finally the plaintext authentication hash (P_h) is included within the encrypted message and unknown until the complete decryption process was successful.
Based on common knowledge about salting passwords and due to the second salt, the intermediate key, the excessive use of cascaded hashing and the "rolling" salt the finally computed derived key should be quite strong if a good pre-shared key as input is chosen, which I consider should at least have 40 bit of entropy.
Code:
==========
(Notation)
==========
H Hash Function (SHA-256)
E Encrypt/Decrypt Function
K pre-shared Key
K_h Hash of pre-shared Key
iK intermediate Key (256 bit)
dK derived Key (512 bit) for Encryption/Decryption
P Plaintext
P_h Hash of Plaintext
C Ciphertext
C_m Ciphermessage
R Random Value
S_1 Salt
S_2 Salt
tS temporary Salt inside Hash Iteration Loop
V_h Verification Hash
i Hashing Iteration to define Costs
|| Concatenating Strings
^ Bit-wise exclusive-or of two Byte Values (XOR)
.eq. equal
.neq. not equal
F Function to calculate a new temporary "rolling" Salt
Where (F) take (S_1) or H(S_2||tS) once as first parameter
before the iteration loop, then the resulting (tS)
# before the first iteration loop
tS = S_1
# inside the first iteration loop
for (0 to i):
iK = H(tS || H(iK || K_h))
tS = F(tS,iK,K_h)
# before the second iteration loop
tS = H(S_2 || tS)
# inside the second iteration loop
for (0 to i):
dK = H(tS || H(dK || iK || K_h))
tS = F(tS,dK,iK)
# build the final 512 bit dK
dK = H(dK) || H(dK || iK)
The function (F) itself is building a new salt value (tS) out
of the hash hex strings of (tS), (iK), (K_H) and (dK) which is
hashed on return. It could be seen like a new pseudo-random
value used as a new salt. Of course (F) could be changed in a
more complex function if desired and it offers plenty of room
for modification and/or extension.
The function (F) works for example as follows
F(tS, iK, K_h):
# split hash in arrays of byte
for (i=0 to length H):
h1[i] = byte(tS)
h2[i] = byte(iK)
hk[i] = byte(K_h)
# shuffle byte array h1 based on h2
for (i=0 to length H):
j = int(h2[i]) mod(length H)
swap(h1[i], h1[j])
# shuffle byte array h2 based on hk
for (i=0 to length H):
j = int(hk[i]) mod(length H)
swap(h2[i], h2[j])
return H(h1 ^ h2)
============
(Encryption)
============
S_1 = H(timestamp || R)
S_2 = H(timestamp || R || P_h)
K_h = H(K)
tS = S_1
iK = H(F(tS,iK,K_h) || H(iK || K_h), i)
tS = H(S_2 || tS)
dK = H(H(F(tS,dK,iK) || H(dK || iK || K_h), i)) || H(dK || iK)
V_h = H(iK)
P_h = H(P)
C = E(V_h || P_h || P, dK)
C_m = (S_1 || S_2 || i || C)
============
(Decryption)
============
K_h = H(K)
tS = S_1
iK = H(F(tS,iK,K_h) || H(iK || K_h), i)
tS = H(S_2 || tS)
dK = H(H(F(tS,dK,iK) || H(dK || iK || K_h), i)) || H(dK || iK)
V_h = H(iK)
V_h'= E(C<1..32>byte, dK) <== if (V_h' .neq. V_h) stop Decryption[*]
P_h'= E(C<33..64>byte, dK)
P = E(C<65...>byte, dK)
P_h = H(P) <== if (P_h' .eq. P_h) then P is untampered
[*] At this point a cipher application could ask the user for a new keyword.
A complete working example listing of this algorithm/scheme in Python for generating (dK) can be found here http://www.freecx.co.uk/PKSF/PKSF_algorithm.py
I am quite convinced that the above mentioned function (F) does a good job in distorting the previous hashes and is able to generate a quite reasonable pseudo-random value as new temporary salt for each next hashing round.
One possible extension of (F) would be including a stream cipher which take (K_h) and later (iK) as key and return the hash of each 128KB stream cipher output for example.
Sure it's also possible not ot use (F) at all but the static salt S_1 and S_2 instead which might be already sufficiently strong enough.
iK = H(S_1 || H(iK || K_h), i)
dK = H(H(S_2 || H(dK || iK || K_h), i)) || H(dK || iK)
Of course I am aware that this algorithm/scheme is somehow an overkill and that there are well known and already widely accepted and implemented key stretching algorithms in use like PBKDF2 or bcrypt and I do not really believe that someone will drop them in favour of my idea, but I'd just like to share my thoughts and ideas.
Furthermore any suggestions about weaknesses or commonly known attacks regarding this scheme/algorithm would be of great help. Finally I am happy to hear your opinion at all.
Cheers,
Karl-Uwe