10-02-2009 6:49PM (ET)
While taking Ron Rivest's class 6.857: Computer and Network Security (an incredible class that I can't recommend enough to anyone interested in computer security) we were assigned to review the hash functions submitted to the NIST SHA-3 cryptographic hash function contest. Each homework group received a list of four hash functions to review. I choose to review Spectral Hash as I found the name entertaining.
The page for the student reviews can be found here. A careful reader might notice that my team choose not to make our reviews public. This was because I thought I had discovered an attack against Spectral Hash and wanted time to double check my results. I was concerned that a programming error may have been responsible for the collisions I was finding. While it turned out that I had made a programming error it was also the case that my attack was correct. After further analysis I contacted the authors of Spectral Hash and released the pre-print Attacks Against Permute-Transform-Xor Compression Functions and Spectral Hash.
The Spectral Hash compression function (each $C$ in Figure 1 is a call to the compression function) takes three parameters: a message $m$, an array of numbers $p$, and the output of the previous call to the compression function $h$. This is how the compression function1 works:
The problem in Spectral Hash arises because each time you permute $p$ using $m$ therefore if you use the same $m$ you permute $p$ in exactly the same way. For an illustrative example consider a small deck of 5 cards. Ace=A, Jack=J, Queen=Q, K=King, and Joker=S.
Before we shuffle the cards they are in the following order [A, J, Q, K, S]. Lets say decide to perform the following shuffle (permute): we swap the first two cards, move the third card to the forth position, move the forth card to the fifth position and move the fifth card to the third position. The order of the deck will evolve like this:
It is a known rule of group theory and card shuffling that applying the same permutation or shuffle over and over again will evidently return the deck it's starting position (that is, unshuffle the previous shuffles). The number of iterations required to return to the starting position is called the order or period of the permutation. In our example our shuffle had an order of 5. Given a deck of $n$ cards the maximum possible order is given by the Landau function. As can be seen here, for a decent sized $n$ the order can be quite large. In fact there is card deck shuffling simulation software for stage magicians to model various shuffles 1 used used in magic tricks.
The attack is simply this: find values of the message block $m$ that result in a permutation with a small order. Repeat these message blocks until $p$ returns to its starting state and continue providing the same value for $m$ until $h$ returns to initial value. Since two different values of $m$ will result in the same $h$, that is $h=IV$ you can generate arbitrary collisions2.
The compression function defined in the Spectral Hash specification is not presented in this form. One of the results that I show in my paper is that Spectral Hash can be reformulated/reduced into this simple form. ↩↩
The exact details are slightly more complicated than causing a collision in $h$ because Spectral Hash has a finalization function that protects against length-extension attacks, but defeating the finalization function is only trivially harder. ↩