I'd like to discuss a key stretching algorithm for symmetric encryption for secure file exchange with pre-shared keywords.

My idea is to use an intermediate key and two different salt where the final key hash will be computed in two separate hashing sessions but additionally inside each hashing session the salt will be altered also by a special function, in the example Python listing below just two simple byte swap of hash values followed by a bit-wise exclusive or calculation in order to generate a "rolling" salt.

Secondly I introduce an additional verification hash (V_h) so that we would have an early notice if the ciphertext was tampered somehow or the keyword was wrong if V_h and V_h' don't match and we can stop any further decryption at this point.

Finally the plaintext authentication hash (P_h) is included within the encrypted message and unknown until the complete decryption process was successful.

Based on common knowledge about salting passwords and due to the second salt, the intermediate key, the excessive use of cascaded hashing and the "rolling" salt the finally computed derived key should be quite strong if a good pre-shared key as input is chosen, which I consider should at least have 40 bit of entropy.



Code:
==========    
(Notation)
==========    

  H       Hash Function    (SHA-256)
  
  E       Encrypt/Decrypt Function
  
  K       pre-shared Key
  K_h     Hash of pre-shared Key
  
  iK      intermediate Key (256 bit)
  dK      derived Key      (512 bit)  for Encryption/Decryption 

  P       Plaintext 
  P_h     Hash of Plaintext 
  
  C       Ciphertext
  C_m     Ciphermessage

  R       Random Value

  S_1     Salt
  S_2     Salt

  tS      temporary Salt inside Hash Iteration Loop

  V_h     Verification Hash

  i       Hashing Iteration to define Costs

  ||      Concatenating Strings

   ^      Bit-wise exclusive-or of two Byte Values (XOR) 

 .eq.     equal 
 
.neq.     not equal

  F       Function to calculate a new temporary "rolling" Salt
  
          Where (F) take (S_1) or H(S_2||tS) once as first parameter 
          before the iteration loop, then the resulting (tS)
          
          
          # before the first iteration loop
          tS = S_1
          
          # inside the first iteration loop
          for (0 to i):
            iK = H(tS || H(iK || K_h))
            tS = F(tS,iK,K_h)

          # before the second iteration loop
          tS = H(S_2 || tS)
          
          # inside the second iteration loop
          for (0 to i):
            dK = H(tS || H(dK || iK || K_h))
            tS = F(tS,dK,iK)

          # build the final 512 bit dK
          dK = H(dK) || H(dK || iK)


          The function (F) itself is building a new salt value (tS) out
          of the hash hex strings of (tS), (iK), (K_H) and (dK) which is
          hashed on return. It could be seen like a new pseudo-random
          value used as a new salt. Of course (F) could be changed in a
          more complex function if desired and it offers plenty of room
          for modification and/or extension.
          
          The function (F) works for example as follows
                    
          F(tS, iK, K_h):
          
            # split hash in arrays of byte
            for (i=0 to length H):
              h1[i] = byte(tS)
              h2[i] = byte(iK)
              hk[i] = byte(K_h)
          
            # shuffle byte array h1 based on h2
            for (i=0 to length H):
              j = int(h2[i]) mod(length H)
              swap(h1[i], h1[j])
              
            # shuffle byte array h2 based on hk
            for (i=0 to length H):
              j = int(hk[i]) mod(length H)
              swap(h2[i], h2[j])
              
            return H(h1 ^ h2)


          

============      
(Encryption)
============

  S_1 = H(timestamp || R)
  
  S_2 = H(timestamp || R || P_h)
    
  K_h = H(K)
  
   tS = S_1
   iK = H(F(tS,iK,K_h)  || H(iK || K_h), i)
      
   tS = H(S_2 || tS)  
   dK = H(H(F(tS,dK,iK) || H(dK || iK || K_h), i)) || H(dK || iK)

  V_h = H(iK)
  
  P_h = H(P)
  

    C = E(V_h || P_h || P, dK)
    
    
  C_m =  (S_1 || S_2 || i || C)



============      
(Decryption)
============      

  K_h = H(K)

   tS = S_1 
   iK = H(F(tS,iK,K_h)  || H(iK || K_h), i)
      
   tS = H(S_2 || tS)  
   dK = H(H(F(tS,dK,iK) || H(dK || iK || K_h), i)) || H(dK || iK)
   
  V_h = H(iK)

  V_h'= E(C<1..32>byte,  dK)  <== if (V_h' .neq. V_h) stop Decryption[*]
 
  P_h'= E(C<33..64>byte, dK)
    

    P = E(C<65...>byte,  dK)


  P_h = H(P)            <== if (P_h' .eq. P_h) then P is untampered

[*] At this point a cipher application could ask the user for a new keyword.

A complete working example listing of this algorithm/scheme in Python for generating (dK) can be found here http://www.freecx.co.uk/PKSF/PKSF_algorithm.py


I am quite convinced that the above mentioned function (F) does a good job in distorting the previous hashes and is able to generate a quite reasonable pseudo-random value as new temporary salt for each next hashing round.

One possible extension of (F) would be including a stream cipher which take (K_h) and later (iK) as key and return the hash of each 128KB stream cipher output for example.

Sure it's also possible not ot use (F) at all but the static salt S_1 and S_2 instead which might be already sufficiently strong enough.

iK = H(S_1 || H(iK || K_h), i)

dK = H(H(S_2 || H(dK || iK || K_h), i)) || H(dK || iK)

Of course I am aware that this algorithm/scheme is somehow an overkill and that there are well known and already widely accepted and implemented key stretching algorithms in use like PBKDF2 or bcrypt and I do not really believe that someone will drop them in favour of my idea, but I'd just like to share my thoughts and ideas.

Furthermore any suggestions about weaknesses or commonly known attacks regarding this scheme/algorithm would be of great help. Finally I am happy to hear your opinion at all.

Cheers,
Karl-Uwe