Source code used in this article
NCloak Obfuscation Tool

In our last article we took a look at Mono.Cecil and how powerful it is for manipulating assemblies at the byte level. In this article we will put the tool to further use and perform basic obfuscation on an assembly.

Before we get into this though we need to consider a few things:

  • What are we renaming?
  • What are we renaming it to?
  • Do we need to keep track of references?

After we've answered these questions, we'll then take a look at the tool in action.

What are we renaming?

One important thing to note in obfuscation is that we don't want to rename everything. The reason is public types and public members: if an external assembly references a public type and we rename it, we break that code unless we also rename the reference in the external assembly. To keep things simple today, we will rename private members only. In a future article we will look at renaming public and internal members across a set of assemblies; this is of course useful if you are distributing an application as a whole and not expecting any third-party vendors to reuse your assemblies.

Why don't you rename internal members today?

Internal members are a funny breed, because they can be treated as public by certain trusted assemblies via the InternalsVisibleToAttribute attribute. This is an extremely useful feature of internal members; however, for simplicity's sake we'll stick with renaming private members only today.

What do you rename members to?

Very good question; we want to rename the members to something that means nothing to the people who may read it, but that is still valid as far as the framework is concerned. To better answer this question, we should investigate what characters the framework allows in the name of an identifier.

In Partition I of the ECMA-335 standard, section 8.5 tells us that a valid name can be built from Unicode characters and that names are compared case-sensitively, so two names need differ only by case to be distinct. To be CLS compliant, names must differ by more than case; however, let's ignore that rule for now and just concentrate on outputting an obfuscated assembly (CLS compliance can be an option at a later date).

So this is good news! It means that we can give our identifiers ANY name, so long as it is made up of valid Unicode characters! We might as well pick characters from a non-English script then...
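To make the naming rules concrete, here is a small sketch in C#. The NameDemo class and its members are hypothetical and exist only for illustration. It shows two fields whose names differ only by case, plus a field named with a Greek letter, all living side by side in one type. Note that an obfuscator writes names straight into the metadata, so it is not even bound by the C# compiler's own identifier rules.

```csharp
using System;
using System.Reflection;

// Hypothetical demo type, not part of NCloak.
public class NameDemo
{
    // Two fields whose names differ only by case are distinct members.
    public int total = 1;
    public int Total = 2;

    // A non-English (Greek) letter is a legal identifier too.
    public int λ = 3;

    // Counts the public instance fields via reflection.
    public static int CountFields()
    {
        return typeof(NameDemo)
            .GetFields(BindingFlags.Public | BindingFlags.Instance)
            .Length;
    }
}
```

Calling NameDemo.CountFields() reports three separate fields, which is what lets an obfuscator hand out names that look nearly identical to a human reader.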

What about references; do we need to track them?

We won't need to track references this round, because we are dealing with private members only. Once we start looking at public and internal members, we will need to start keeping track of what we changed and where. But, let's keep today's article nice and simple...

Implementing the code

We'll implement the code in a two step process:

  1. Implement a naming container
  2. Enumerate assembly members

First off, our naming container. To manage this, I created two classes: a NameManager, which handles naming for each of types, methods, properties and fields, and a CharacterSet, which handles the individual logic for each of these sub-tables. First, the CharacterSet:

using System;

public class CharacterSet
{
    private readonly char startCharacter;
    private readonly char endCharacter;

    /// <summary>
    /// Initializes a new instance of the <see cref="CharacterSet"/> class.
    /// </summary>
    /// <param name="startCharacter">The start character.</param>
    /// <param name="endCharacter">The end character.</param>
    public CharacterSet(char startCharacter, char endCharacter)
    {
        this.startCharacter = startCharacter;
        this.endCharacter = endCharacter;
        CurrentCharacter = startCharacter;
        Prefix = String.Empty;
    }

    /// <summary>
    /// Gets the prefix for names in this set.
    /// </summary>
    /// <value>The prefix.</value>
    public string Prefix { get; private set; }

    /// <summary>
    /// Gets the current character being used.
    /// </summary>
    /// <value>The current character.</value>
    public char CurrentCharacter { get; private set; }

    /// <summary>
    /// Gets the end character for this set.
    /// </summary>
    /// <value>The end character.</value>
    public char EndCharacter
    {
        get { return endCharacter; }
    }

    /// <summary>
    /// Gets the start character for this set.
    /// </summary>
    /// <value>The start character.</value>
    public char StartCharacter
    {
        get { return startCharacter; }
    }

    /// <summary>
    /// Generates a new name.
    /// </summary>
    /// <returns>A unique name based upon the character set settings</returns>
    public string Generate()
    {
        //Get the name
        string newName = Prefix + CurrentCharacter;

        //Check if we've just used the last character in the range
        if (CurrentCharacter == endCharacter)
        {
            //Roll over: lengthen the prefix and restart the range.
            //TODO - we need a proper implementation here; names stay unique
            //only because each pass is one character longer than the last
            Prefix = Prefix + startCharacter;
            CurrentCharacter = startCharacter;
        }
        else
        {
            //Increment our state
            CurrentCharacter++;
        }

        //Return it
        return newName;
    }
}

The intention of this class is to maintain a set of characters that are valid for an identifier name, running from startCharacter to endCharacter. The class also tracks the current character (like an enumerator) and rolls over to a prefixed name once the range is exhausted, although that rollover is only a basic implementation for now.
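One possible way to address the rollover TODO is to treat names like spreadsheet column labels: hand out every one-character name, then every two-character name, and so on. The sketch below is only an illustration under that assumption; RollingNameGenerator is a hypothetical class and is not part of NCloak.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of NCloak: yields every name of length 1,
// then every name of length 2, and so on, like spreadsheet column labels.
public class RollingNameGenerator
{
    private readonly char startCharacter;
    private readonly int rangeSize;
    private long counter; // number of names handed out so far

    public RollingNameGenerator(char startCharacter, char endCharacter)
    {
        if (endCharacter < startCharacter)
            throw new ArgumentException("endCharacter must not precede startCharacter");

        this.startCharacter = startCharacter;
        rangeSize = endCharacter - startCharacter + 1;
    }

    public string Generate()
    {
        long n = counter++;
        StringBuilder name = new StringBuilder();

        // Bijective base-N conversion: peel off the last character first,
        // then move on to the next (more significant) position.
        do
        {
            name.Insert(0, (char)(startCharacter + (int)(n % rangeSize)));
            n = n / rangeSize - 1;
        } while (n >= 0);

        return name.ToString();
    }
}
```

With a tiny range of 'a' to 'c', successive calls give a, b, c, aa, ab, ac, ba and so on, so names stay as short as possible instead of growing by one character per pass.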

Our next class is the NameManager:

using System.Collections.Generic;

public class NameManager
{
    private readonly Dictionary<NamingTable, CharacterSet> namingTables;

    /// <summary>
    /// Initializes a new instance of the <see cref="NameManager"/> class.
    /// </summary>
    public NameManager()
    {
        namingTables = new Dictionary<NamingTable, CharacterSet>();
    }

    /// <summary>
    /// Sets the character set for a naming table.
    /// </summary>
    /// <param name="table">The table.</param>
    /// <param name="characterSet">The new character set to use.</param>
    public void SetCharacterSet(NamingTable table, CharacterSet characterSet)
    {
        //The indexer adds the entry if it is missing, or replaces it otherwise
        namingTables[table] = characterSet;
    }

    /// <summary>
    /// Generates a new unique name from the naming table.
    /// </summary>
    /// <param name="table">The table to generate a name from.</param>
    /// <returns>A unique name</returns>
    public string GenerateName(NamingTable table)
    {
        //Check the naming table exists
        if (!namingTables.ContainsKey(table))
            SetCharacterSet(table, DefaultCharacterSet);

        //Generate a new name
        if (table == NamingTable.Field) //For fields prepend an _ to make sure it differs from properties etc
            return "_" + namingTables[table].Generate();
        return namingTables[table].Generate();
    }

    /// <summary>
    /// Gets the default character set.
    /// </summary>
    /// <value>A new character set covering \u0800 to \u08ff.</value>
    private static CharacterSet DefaultCharacterSet
    {
        get { return new CharacterSet('\u0800', '\u08ff'); }
    }
}

As you can see, this maintains a Dictionary mapping each NamingTable to a CharacterSet. NamingTable is an enumeration detailed below, whereas CharacterSet is the class outlined above. The two important points of this class are the GenerateName method and the DefaultCharacterSet property. In the GenerateName method we rely on the individual CharacterSet to maintain state within each "name table", with one exception: fields. Each table counts independently from the same start character, so a field and a property in the same type could otherwise be handed the same name (names are case-sensitive); to be on the safe side, we prefix every field name with "_". The default character set, as you can see, is a total of 256 characters long (\u0800 to \u08ff inclusive). I chose this particular range because its characters are displayed as an empty square on most systems in English-speaking countries.

For completeness I'll outline the possible "name tables" exposed via the NamingTable enumeration:

   public enum NamingTable
   {
       Type,
       Method,
       Property,
       Field
   }

Now that we've dealt with how to generate names, what does our obfuscate method look like? To be fair, it is very similar to the code in last week's post...

       /// <summary>
       /// Obfuscates the specified assembly.
       /// </summary>
       /// <param name="assembly">The path of the assembly to obfuscate.</param>
       private void Obfuscate(string assembly)
       {
           //Get the assembly definition
           AssemblyDefinition definition = AssemblyFactory.GetAssembly(assembly);

           //Keep a dirty bit for saving
           bool isDirty = false;

           //Go through each module
           foreach (ModuleDefinition moduleDefinition in definition.Modules)
           {
               //Go through each type
               foreach (TypeDefinition typeDefinition in moduleDefinition.Types)
               {
                   //We can't rename types yet - we don't know enough!

                   //Go through each method
                   foreach (MethodDefinition methodDefinition in typeDefinition.Methods)
                   {
                       if (methodDefinition.IsPrivate)
                       {
                           //Rename
                           methodDefinition.Name = Settings.NameManager.GenerateName(NamingTable.Method);
                           isDirty = true;
                       }
                   }

                   //Properties
                   foreach (PropertyDefinition propertyDefinition in typeDefinition.Properties)
                   {
                       //Rename only if the whole property is private
                       if (propertyDefinition.GetMethod != null && propertyDefinition.SetMethod != null)
                       {
                           //Both parts need to be private
                           if (propertyDefinition.GetMethod.IsPrivate && propertyDefinition.SetMethod.IsPrivate)
                           {
                               //Rename
                               propertyDefinition.Name = Settings.NameManager.GenerateName(NamingTable.Property);
                               isDirty = true;
                           }
                       }
                       else if (propertyDefinition.GetMethod != null)
                       {
                           //Only the get is present - make sure it is private
                           if (propertyDefinition.GetMethod.IsPrivate)
                           {
                               //Rename
                               propertyDefinition.Name = Settings.NameManager.GenerateName(NamingTable.Property);
                               isDirty = true;
                           }
                       }
                       else if (propertyDefinition.SetMethod != null)
                       {
                           //Only the set is present - make sure it is private
                           if (propertyDefinition.SetMethod.IsPrivate)
                           {
                               //Rename
                               propertyDefinition.Name = Settings.NameManager.GenerateName(NamingTable.Property);
                               isDirty = true;
                           }
                       }
                   }

                   //Fields
                   foreach (FieldDefinition fieldDefinition in typeDefinition.Fields)
                   {
                       //Rename if private
                       if (fieldDefinition.IsPrivate)
                       {
                           fieldDefinition.Name = Settings.NameManager.GenerateName(NamingTable.Field);
                           isDirty = true;
                       }
                   }
               }
           }

           //Save the assembly if it is dirty
           if (isDirty)
           {
               string outputPath = Path.Combine(Settings.OutputDirectory, Path.GetFileName(assembly));
               Console.WriteLine("Outputting assembly to {0}", outputPath);
               AssemblyFactory.SaveAssembly(definition, outputPath);
           }
       }

In general terms, we simply go through each type in the assembly and rename all private methods, properties, and fields. A property is only renamed when every accessor it has is private. We ignore types at this stage for simplicity; perhaps next week we'll look at renaming types and members that are public or internal.
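The three property branches in the method above boil down to one rule: every accessor that exists must be private. As a rough sketch, that rule could be pulled out into a helper such as the one below. PropertyRules and IsWhollyPrivate are hypothetical names, and plain nullable booleans stand in for the Mono.Cecil accessor checks so that the logic can be read (and tested) on its own.

```csharp
using System;

// Hypothetical helper, not part of NCloak or Mono.Cecil.
public static class PropertyRules
{
    // Each argument is null when the accessor does not exist,
    // otherwise it says whether that accessor is private.
    public static bool IsWhollyPrivate(bool? getterIsPrivate, bool? setterIsPrivate)
    {
        // A property with no accessors at all is left alone.
        if (getterIsPrivate == null && setterIsPrivate == null)
            return false;

        // Every accessor that exists must be private.
        return (getterIsPrivate ?? true) && (setterIsPrivate ?? true);
    }
}
```

Inside the property loop, the caller would pass null for a missing GetMethod or SetMethod and the accessor's IsPrivate value otherwise, then rename only when the helper returns true.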

Testing our code

N.B. To test this code yourself, please download the console application example found here.

For testing purposes, I wrote a simple class which doesn't do much at all:

using System;

class Program
{
 public static void Main(string[] args) {
   MyLogic logic = new MyLogic();
   logic.Run();
   Console.WriteLine("The result of running the logic is: " + logic.GetResult());
 }
}

public class MyLogic
{
 private int variable;

 public void Run()
 {
   variable = Add(5,8);
   PrivateProperty = "Hello World";
 }

 private int Add(int x, int y)
 {
   return x + y;
 }

 public string GetResult() {
   return String.Format("{0} - {1}", PrivateProperty, variable);
 }

 private string PrivateProperty {
   get; set;
 }
}

The expected output of this program is a simple text string:

The result of running the logic is: Hello World - 13

Before running our obfuscator over this code, the assembly looks like so in Reflector:

[Figure: Our code before obfuscation]

After running the obfuscator over the code, our assembly looks like:

[Figure: Our code after obfuscation]

It worked, and the program runs with exactly the same output! As you can see, we've made a good start on obfuscation; however, I think we can do better. We'll look further into this in the next articles.

Source Code

The most up to date source for NCloak can be found at the NCloak project hosting page. The code found on the project page is likely to evolve regularly. Alternatively, you can get the complete source code specific to this blog here.

Conclusion

As you can see from this example, with the help of Mono.Cecil, writing an obfuscator is likely to be a cinch! While we've only done a basic implementation of one, it didn't take much code at all to put into place. In the coming weeks we'll extend this, and take a look at:

  • Obfuscating public members
  • Making the code unreadable by .NET Reflector
  • Breaking into the obfuscated code (was it worth it?)
  • Tamper Proofing
  • Anything else I might think of :)

In the meantime - please subscribe to my post, and remember to vote for this article using the buttons below. Till next time...


5 comments:

  1. Tyron said...

    Another great article Paul!
    Looking forward to the progress of NCloak!

  2. Thushan Fernando said...

    Great start to NCloak, Paul. Another option, although without source, is Babel Obfuscator, which I'm using for some of the products we ship.

    http://code.google.com/p/babelobfuscator/

    But I'll be interested to see where NCloak goes and will be keeping an eye on it :-)

    The way I see it, there's only so much you can really do; if your product is really in demand, you will find someone somewhere reversing it and finding a way through your licensing code. But at least with such tools you can semi-protect your IP from simple reverse engineering.

  3. paulmason said...

    Thanks for your comments!

    You are exactly right, Thushan; there will always be someone willing to put in the time and effort to reverse engineer your product. The purpose of an obfuscator and other such tools can never really be to stop this from happening, but more to act as a deterrent.

    Thanks for pointing out Babel Obfuscator also; it is good to be aware of all of the available options out there to make sure the right tool for the job is chosen.

  4. Anonymous said...

    Great article Paul!

    See my tutorial: How to recompile and change an obfuscated assembly (PDF document, in Spanish)

    The .pdf link: http://blogs.gamefilia.com/ollydbg/05-08-2009/25261/anexo-tutorial-13-recompilando-un-ensamblado-ofuscado

  5. Anonymous said...

    I found a really strange thing while trying to optimize renaming in my assembly. Using Mono.Cecil I can rename all class members (fields, methods ...) with the same name. What do you think about it?