The primary purpose of an obfuscator is to stop users from understanding your application. The idea is that if you can stop them making sense of your application, you can stop them bypassing licensing steps and/or stealing code. A popular feature of many commercial obfuscators takes this a step further and stops popular decompilers from being able to read your program at all. While this may sound quite attractive, it can also be quite scary, as you need to break many of the "set rules and guidelines" to make it happen. In this article, we'll discuss two methods for stopping Reflector from decompiling our code.

What are my options?

Well, to be honest there are likely to be plenty of inventive methods to stop Reflector, and sometimes even ILDASM, MonoDis etc., from being able to decompile your code. In this article, I'll cover two methods that I have come across on my travels and attempt to give the positives and negatives of each. If you know of any more that you think are worth mentioning, please let me know, as I'll consider them for NCloak too!

In this particular article, I will only cover methods that keep the code as IL, as opposed to compiling it to native code.

Method 1: Use Invalid IL

The first method I'll cover is to simply insert invalid IL opcodes into your code base. This seems to be a fairly popular method, which is (funnily enough) also easy to remove from the code. In a nutshell, assume you had the following code:

  IL_0000:  ldarg.0
  IL_0001:  callvirt   instance char[] [mscorlib]System.String::ToCharArray()
  IL_0006:  stloc.0
  IL_0007:  ldc.i4.0

To fool Reflector, all we need to do is insert an invalid opcode at the start of the method, preceded by a branch statement that jumps over it:

  IL_0000:  br.s IL_0004
  IL_0002:  0x00C0
  IL_0004:  ldarg.0
  IL_0005:  callvirt   instance char[] [mscorlib]System.String::ToCharArray()
  IL_000a:  stloc.0
  IL_000b:  ldc.i4.0

As you can see, the program will execute exactly as before, because the branch statement skips the invalid operation that directly follows it. Reflector (at the moment), however, doesn't handle the invalid opcode correctly and will instead display: // Invalid method body.
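For readers who want to see the byte-level effect, here is a minimal Python sketch of the patch above. It is an illustration under my own assumptions rather than NCloak's implementation: the function name `prepend_junk`, the metadata token in the sample, and the choice to work on a bare IL byte string are all hypothetical. It only handles the instruction stream; a real tool would also have to update the code size stored in the method header and shift any exception handler offsets, which are measured from the start of the method body.

```python
BR_S = 0x2B                 # CIL opcode for br.s (1-byte signed relative offset)
JUNK = bytes([0x00, 0xC0])  # the two bytes shown at IL_0002; 0xC0 is not an assigned opcode


def prepend_junk(il: bytes, junk: bytes = JUNK) -> bytes:
    """Return the IL stream with 'br.s <skip>' and the junk bytes placed in front.

    The br.s offset is relative to the end of the branch instruction, so an
    offset equal to len(junk) lands on the first original instruction.
    """
    if not 0 < len(junk) <= 127:
        raise ValueError("junk must fit in a positive signed 8-bit branch offset")
    return bytes([BR_S, len(junk)]) + junk + il


# ldarg.0; callvirt <token>; stloc.0; ldc.i4.0
# The token 0x0A000010 is a made-up MemberRef token, used for illustration only.
original = bytes([0x02, 0x6F, 0x10, 0x00, 0x00, 0x0A, 0x0A, 0x16])
patched = prepend_junk(original)
print(patched.hex())
```

The first original instruction now sits at offset 4, matching the listing. Relative branches inside the original body keep working because every instruction moves by the same amount.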

The advantage of this method is that it works well without affecting the runtime performance of the program (well, the difference is negligible anyway). The disadvantage is that it may not be forward compatible; the runtime at a later date MAY complain about this, as it is outside the scope of the ECMA specification. I personally don't think that the runtime will complain about this in the future, but I can't make any guarantees! The other disadvantage of this method is that it is quite trivial to remove from the assembly, allowing us to decompile as before. In a later article I'll provide a simple program to do this without any effort.
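To give a feel for how little work the removal takes, here is a naive Python sketch that undoes the "br.s plus junk bytes" pattern from Method 1. It is not the program promised for the later article, just a guess at the simplest possible approach: the function name `strip_leading_junk` is hypothetical, it works on a bare IL byte string, and it assumes you already know the method was patched this way.

```python
BR_S = 0x2B  # CIL opcode for br.s (1-byte signed relative offset)


def strip_leading_junk(il: bytes) -> bytes:
    """Drop a leading 'br.s <n>' together with the n bytes it jumps over.

    Returns the input unchanged when it does not start with a forward br.s
    whose target lies inside the byte string.
    """
    if len(il) < 2 or il[0] != BR_S:
        return il
    skip = il[1]
    if skip == 0 or skip > 0x7F:   # zero or negative offset: not our pattern
        return il
    if 2 + skip > len(il):         # branch target would fall outside the body
        return il
    return il[2 + skip:]


# br.s +2; 0x00 0xC0; ldarg.0; callvirt <made-up token>; stloc.0; ldc.i4.0
patched = bytes([0x2B, 0x02, 0x00, 0xC0,
                 0x02, 0x6F, 0x10, 0x00, 0x00, 0x0A, 0x0A, 0x16])
restored = strip_leading_junk(patched)
print(restored.hex())
```

A legitimately compiled method can also begin with a forward br.s, so a real tool would want to confirm that the skipped bytes do not decode as valid instructions before removing them, and it would again need to fix up the method header and any exception handler offsets.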

Method 2: Modify Header Information

The second method that I've come across is to modify the assembly headers so that every disassembler fails to read the assembly; ILDASM, MonoDis, and Mono.Cecil included. There are various inventive ways to do this, such as including unrecognised tables in the header section, or exploiting assumptions that typical readers make but the framework itself does not.

In a previous article I covered some of the necessary headers that you need to skip through. If you've read that article, you'll be aware that parsing the header section is very exact and follows some specific rules. This particular method takes advantage of the differences in error handling between the CLR and the disassemblers and uses them to its own advantage. Unfortunately, while I have seen this done before in some .NET assemblies, it is difficult to give an exact example of what will break a reader without some intense investigation!

Now, this method is very effective; however, I am dubious (and curious) about its cross-platform capabilities and also how the CLR will handle it in the future. While the first method above (using invalid IL) takes advantage of a difference between how disassemblers and the CLR work, this, in my opinion, is more of a nasty hack. It stops anyone being able to inspect your assembly without a binary editor, which is extremely effective, but at what cost?

Does anyone have any experience of this method? I'm quite curious to know its effectiveness in the wild, to determine whether it is worth implementing in the obfuscator.

Summary

While there are many variations out there to stop Reflector decompiling your assembly, I think that each of them has its disadvantages. I would expect that any method that works now is only temporary and shouldn't be a main selling point when choosing an obfuscator (though it is a nice add-on!).

We took a brief look at two methods: using invalid IL (quite common), and modifying the assembly header information to break any typical readers (less common). Personally, I think that reverse engineering is a reality no matter what security you have used on your assembly. I've even seen hardware-based encryption dongles reverse engineered to bypass certain licensing checks! I liken "breaking" Reflector to boarding up the windows of your house. It can be quite effective, but it is usually as ugly as hell to do!

Coming Soon

In coming weeks we'll look at:

  • Implementing the invalid IL technique into NCloak
  • Reverse engineering the invalid IL technique
  • Strategies for writing a GUI
  • Anything else I can think of!

Until then, have a great weekend and week ahead. If you have any questions then please feel free to ask, or alternatively send me an email.


8 comments:

  1. Laurent Etiemble said...

    You can also have variable stack size at a given execution point: it breaks most of the code paths lookup algorithms.

    This can be done with a "dup" opcode and counter-balance with a "pop" one placed at strategic points.

  2. Anonymous said...

    I'm enjoying the series but it seems a little depressing that if you really want to protect your IP then don't use managed code and stick with c++?

  3. Anonymous said...

    Hi Paul, is it possible to download the source examples from the old article?

    tivit.co.nz is down.

  4. paulmason said...

    Thanks for that Laurent; I'll definitely take a further look into that one.

    As for the depressing viewpoint; it shouldn't be! In general, every program (C++ or .NET) can be reverse engineered using one technique or another. .NET can be slightly easier to disassemble due to tools available and a standardised architecture however. While every method has a "reverse engineering" tactic (to some extent), it really depends on your target audience. There is no 100% method, therefore you really just need to pick and choose the tools and techniques you need to stop the "bad eggs". The idea of this blog is to hopefully give you an idea as to what feature does what, and how effective it is in reality.

    As for the old source code availability; sorry about that. tivit.co.nz was down for some maintenance and is now back up (hence the samples are now available again). I may look at moving the code samples over to skydrive in the early year however so this need not happen :)

  5. Alex said...

    Thanks for the old source code ^^

    I want to thank you for these wonderful articles, the most interesting I've ever read.

    Thanks for your hard work, I appreciate it very much.

  6. Anonymous said...

    Hi,

    Great article.

    Could you please explain how you recreate the assembly when you add invalid IL? (Method 1: Use Invalid IL).

    If you run ilasm, you will get an error....

    Tnx in advance,
    Luka

  7. paulmason said...

    Hi Luka,

    Please take a look at the latest article; this describes one such implementation of inserting invalid IL into an assembly by using Mono.Cecil.

    Regards,

    Paul

  8. Anonymous said...

    Hi Paul,

    nice article!

    It would be interesting to see how to deobfuscate such a file!

    greetings, Peter