Security and Compiler Optimizations

A previous post I wrote about password comparison led to much discussion between some colleagues and I; most notably long-time friend and co-author David LeBlanc and Reid Borsuk in the security engineering team.

Reid mentioned an issue that I have not considered in a long time; and that is the potential for compiler optimizations to introduce security vulnerabilities.

The first time I learned of this issue was during the development of Windows Server 2003. Here is the link to my initial message to bugtraq in Nov 2002 on the topic.

Fast forward almost 17 years.

The issue with my initial code is the compiler might generate code that could short-circuit the string comparison. Here’s the code:

static bool TimeConstantCompare(string hashName, string s1, string s2)
      var h1 = HashAlgorithm.Create(hashName).ComputeHash(Encoding.UTF8.GetBytes(s1));
      var h2 = HashAlgorithm.Create(hashName).ComputeHash(Encoding.UTF8.GetBytes(s2));

      int accum = 0;
      for (int i = 0; i < h1.Length; i++)
          accum |= (h1[i] ^ h2[i]);

      return accum == 0;

The code walks over the bytes of each hash; bytes 0..N from hash 1 and bytes 0..N for hash 2. Each byte is XOR’d and if two bytes are different the result is non-zero (it’s a property of XOR) and this result is OR’d with an accumulator. This means, and this is the important part, that if the accumulator ever gets to non-zero it will always be non-zero from that point on.

Now here’s the fun part.

The code returns a bool (true or false) based on the accumulator being zero or not, but the compiler might emit optimized code that no longer performs the rest of the loop if at ANY point the accumulator is not zero.

For example, let’s say the hashes are 32-bytes in size. In theory, the loop will ALWAYS iterate from 0-31. However, let’s say when i==17, the two bytes are 0xDE and 0xAD, this will yield a value of 0xDE XOR 0xAD == 0x73 and if the accumulator is 0x00, then 0x00 OR 0x73 == 0x73. The compiler knows that from this point forward (because of how OR works), the accumulator will always be >= 0x73. The last line of the function returns a value based on the accumulator == 0, the compiler knows that the accumulator can NEVER be 0x00 from this point on, so skip the rest of the loop!

I hope that makes sense!

This isn’t a huge issue because we’re comparing a hash and not the actual password. But it’s still valuable to know compilers can do this.

There are two fixes, here’s the updated method:

[MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.NoOptimization)]        
static int TimeConstantCompare(string hashName, string s1, string s2)
     var h1 = HashAlgorithm.Create(hashName).ComputeHash(GetRawBytes(s1));
     var h2 = HashAlgorithm.Create(hashName).ComputeHash(GetRawBytes(s2));

     int accum = 0;
     for (int i = 0; i < h1.Length; i++)
          accum |= (h1[i] ^ h2[i]);

     return accum;

First, the method returns an int rather than a bool, so the compiler has no clue what you’re doing with an int. In theory, the compiler could look at how the code calling this function uses the return value to optimize the code, but inter-procedural optimizations are harder and less common than intra-procedural optimizations. It would be very hard for the compiler to ascertain how to optimize this method based on every call in the code to the method.

The real fix is to add the NoOptimization directive before the method implementation. This prevents the optimizer from doing little tricks like this. The obvious downside is a potential performance hit. If this is code that’s called constantly, in a tight loop, then you may see degradation. But like everything in life, measure!

Reid provided me with some more in-depth detail about inter- vs intra-procedural optimizations. You probably don’t need to know this, but if you’re nosey, read on.

This statement is true for traditional compilers, but the .NET JIT is prohibited from inter-procedural optimizations only in the case where both NoInlining and NoOptimization is enforced, not just NoOptimization. That’s because the .NET runtime requires method signatures to be carried with the objects they represent. This means every instance of Program (including the static one) carries the pointer to the compiled method (or JIT stub pre-compile), so every caller needs to call the function in the same way. The caller would then be restricted to calling the “standard” implementation of the method without this cheat.
The way .NET implements the scenario you’re talking about is to force an inline to occur. After inlining occurs, the NoOptimization attribute gets stripped since the method disappeared. Then they force optimization on the new parent method with code already inlined. Preventing both inlining and optimization prevents the specific optimization you refer to in your blog post.

My updated code for string comparison is on GitHub

On a side note, you may notice the code to convert a string into a series of bytes is different. That’s a topic for another post 🙂