Encryption vs compression detection
I've been investigating ways to distinguish between data that is compressed and data that is encrypted. Entropy is a good way of finding scrambled data, but it cannot tell compressed blocks from encrypted ones, because both look close to random at the byte level.
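For reference, here is a minimal sketch of the byte-frequency entropy measure mentioned above. It is not taken from the attached code; the function name `byte_entropy` is just an illustrative choice. Well-compressed and encrypted blocks both tend to score close to 8 bits per byte, which is why this measure alone does not separate them.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Shannon entropy of the byte histogram, in bits per byte (range 0..8).
// Hypothetical helper for illustration, not part of the attached code.
double byte_entropy(const std::vector<std::uint8_t>& data) {
    if (data.empty()) return 0.0;

    // Count how often each of the 256 possible byte values occurs.
    std::array<std::size_t, 256> counts{};
    for (std::uint8_t b : data) ++counts[b];

    // H = -sum(p * log2(p)) over the byte values that actually occur.
    const double n = static_cast<double>(data.size());
    double h = 0.0;
    for (std::size_t c : counts) {
        if (c == 0) continue;
        const double p = static_cast<double>(c) / n;
        h -= p * std::log2(p);
    }
    return h;
}
```

A block of identical bytes scores 0, and a block in which every byte value is equally frequent scores 8.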
Instead of looking at how often each byte value occurs in the file, this code treats the file as the output of a Boolean function and looks at the kind of equations that must give rise to that output sequence. The same method is used to test the quality of random number generators.
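As a rough illustration of that idea (a sketch under my own assumptions, not the attached implementation), the snippet below unpacks a block of bytes into a +1/-1 sequence and applies a fast Walsh-Hadamard transform to it. The names `bits_to_signs` and `fwht`, the MSB-first bit order, and the restriction to power-of-two block lengths are all choices made for this example. Each output coefficient measures how strongly the block correlates with one linear Boolean function of the bit index, so a block with a few very large coefficients is close to a simple linear equation, while random-looking data should give coefficients that stay small relative to the block length.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Unpack bytes into a +1/-1 sequence, most significant bit first.
// A 0 bit becomes +1 and a 1 bit becomes -1 (assumed convention).
std::vector<int> bits_to_signs(const std::vector<std::uint8_t>& bytes) {
    std::vector<int> signs;
    signs.reserve(bytes.size() * 8);
    for (std::uint8_t b : bytes)
        for (int i = 7; i >= 0; --i)
            signs.push_back(((b >> i) & 1) ? -1 : 1);
    return signs;
}

// In-place fast Walsh-Hadamard transform, unnormalised, natural order.
// Returns false (leaving the input untouched) unless the length is a
// non-zero power of two.
bool fwht(std::vector<int>& a) {
    const std::size_t n = a.size();
    if (n == 0 || (n & (n - 1)) != 0) return false;

    // Butterfly passes: combine elements len apart, doubling len each pass.
    for (std::size_t len = 1; len < n; len <<= 1) {
        for (std::size_t i = 0; i < n; i += len << 1) {
            for (std::size_t j = i; j < i + len; ++j) {
                const int x = a[j];
                const int y = a[j + len];
                a[j] = x + y;
                a[j + len] = x - y;
            }
        }
    }
    return true;
}
```

For example, the single byte 0x55 (bits 01010101) is exactly the lowest bit of the index, so all of the weight lands in coefficient 1 and every other coefficient is zero.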
You can find my C++ implementation of the Walsh-Hadamard transform attached. The idea was eventually to build this measurement into some kind of GUI tool for people to use, but I'm not sure that I'm getting good results with it.
You will have to compile it yourself if you want to try it out, but you might just be interested in the code.