A Tale of .Net Deobfuscation - VirtualGuard Devirtualization

Suraj Malhotra

2023-04-07

Tools

.Net, Obfuscation, vm

This is the second part of the VirtualGuard Protector series and focuses on its virtualization techniques. In it I write a devirtualizer (devirt) for the VMs implemented in VirtualGuard using AsmResolver. If you missed the first part about basic deobfuscation techniques in .Net samples, you can read it here.

Introduction

Two different VMs have been released for VirtualGuard so far, i.e. Spider and Crocodile. Spider was released first and is trivial to solve, as it has an almost 1:1 mapping with the CIL instruction set. We’ll talk about the Crocodile VM in this post; Spider is left as an exercise for the reader.
From the first part we already have the deobfuscated binaries of the crackmes, which makes our task much easier. You can also download them from the links below:

Cleaned Spider 🕸 | Cleaned Crocodile 🐊

Initial Analysis

Long story short, I did some debugging and wrote a toy disassembler for the Crocodile VM. You can check out the disassembled output here. Although that was enough to solve the challenge, Mito motivated me to develop a full devirt for it.

Devirtualization

I chose AsmResolver over dnlib to code the devirt this time because:

I liked its syntax and it also has pretty good documentation here.
It is developed by Washi, who is also the author of OldRod, and therefore I’m a fan.
Heck yeah! Why not?

I followed the structure of other devirts like MemeDevirt, which divides a VM devirt into several stages and thus makes it easier to follow along.

We’ll now discuss all these stages in detail.

AnalyzeResources

The VMInit method gets the resource “crocodile” from the executable.

This is the VM bytecode. We can get the resource data using AsmResolver in the following way.

ManifestResource resource = (from x in module.Resources
                             where x.Name == "crocodile"
                             select x).First();
VM.CrocodileBytecode = resource.GetData();

AnalyzeMethod

We now need to identify all the virtualized methods and gather information about them.

I wrote a simple signature for detecting them: it checks whether the method contains a call to A4FD01::2CDA08, i.e. the VM dispatcher. This isn’t a robust signature, as the method could be named differently in another executable, but let’s not worry about that for now. We also need other information, such as the token used for the method disassembly (disasConst) and the type signatures of the arguments passed to the dispatcher.

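// Signature: the third-to-last instruction is a call to the VM dispatcher.
if (methodInstr[instrCount - 3].OpCode == CilOpCodes.Call
    && methodInstr[instrCount - 3].Operand.ToString().Contains("A4FD01::2CDA08"))
{
    // The constant loaded right before the call is the token used for disassembly.
    int disasConst = methodInstr[instrCount - 4].GetLdcI4Constant();
    List<TypeSignature> paramTypes = new List<TypeSignature>();

    // Collect the types of the arguments that are forwarded to the dispatcher.
    foreach (var item in methodInstr)
    {
        // The pattern match also guards against an operand that is not a Parameter.
        if (item.OpCode == CilOpCodes.Ldarg && item.Operand is Parameter pp)
        {
            paramTypes.Add(pp.ParameterType);
        }
    }
    VM.MethodVirt.Add(new VMMethod(method.FullName, disasConst, paramTypes, method));
}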

InitializeMethod

The constructor calls the VMInit method with an integer argument.

Every byte of the VM bytecode is first XORed with this constructor argument. The same argument is used again afterwards as the XOR operand when calculating the number of VM blocks.
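A minimal sketch of these two XOR steps, using only the base class library. The names BytecodeDecoder, XorDecode and ReadBlockCount are hypothetical, and truncating the integer key to its low byte for the per-byte XOR is my assumption; the real VMInit may apply the key differently.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of VirtualGuard or the devirt.
public static class BytecodeDecoder
{
    // XORs every byte with the key.
    // Assumption: only the low 8 bits of the key affect a single byte.
    public static byte[] XorDecode(byte[] data, int key)
    {
        var result = new byte[data.Length];
        for (int i = 0; i < data.Length; i++)
        {
            result[i] = unchecked((byte)(data[i] ^ key));
        }
        return result;
    }

    // Reads the first little-endian Int32 of the decoded stream and
    // XORs it with the same key to obtain the number of VM blocks.
    public static int ReadBlockCount(byte[] decoded, int key)
    {
        using (var reader = new BinaryReader(new MemoryStream(decoded)))
        {
            return reader.ReadInt32() ^ key;
        }
    }
}
```

Because XOR is its own inverse, calling XorDecode twice with the same key returns the original bytes, which is a handy sanity check while experimenting.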

The VMInit function parses the VM Bytecode for block instructions. There are two operands for every instruction. Our code for parsing looks pretty similar.

int numBlocks = _reader.ReadInt32() ^ cctorArg;
List<int> blockTokens = new List<int>();
for (int i = 0; i < numBlocks; i++)
{
    List<VMInstruction> blockInstructions = new List<VMInstruction>();
    int token = _reader.ReadInt32() ^ numBlocks;
    int numInstr = _reader.ReadInt32() ^ numBlocks;
    for (int j = 0; j < numInstr; j++)
    {
        VMInstruction vmInstr = new VMInstruction
        {
            Opcode = _reader.ReadByte(),
            Operand1 = VMInstruction.ReadOperand(_reader),
            Operand2 = VMInstruction.ReadOperand(_reader),
        };
        blockInstructions.Add(vmInstr);
    }
    VM.Blocks.Add(token, blockInstructions);
}

InitializeReplace

Let’s talk about disassembling the VM now. We find a typical VM dispatcher: a while loop that executes VM instructions and depends on a state variable to continue execution.

The execution state is set to CONTINUE. The stack is initialized with a size of 10 elements and the local variables list with a size of 60 variables.
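To make the shape of such a dispatcher concrete, here is a toy version of the loop. Only the state variable, the 10-slot stack and the 60-slot locals list come from the analysis above; the opcode numbers, the single-operand instruction format and the int-only values are invented for illustration and do not match Crocodile’s real handlers.

```csharp
using System;
using System.Collections.Generic;

public enum ExecState { Continue, Return }

// Toy dispatcher with made-up opcodes; not the Crocodile handler table.
public static class ToyDispatcher
{
    public static int Run(IReadOnlyList<(byte Opcode, int Operand)> code)
    {
        var stack = new int[10];   // stack with 10 elements
        var locals = new int[60];  // locals list with 60 variables
        int sp = 0, ip = 0, result = 0;
        var state = ExecState.Continue;

        // The loop keeps dispatching handlers while the state is Continue.
        while (state == ExecState.Continue)
        {
            var ins = code[ip++];
            switch (ins.Opcode)
            {
                case 0x01: stack[sp++] = ins.Operand; break;          // push constant
                case 0x02: stack[sp++] = locals[ins.Operand]; break;  // push local
                case 0x03: locals[ins.Operand] = stack[--sp]; break;  // pop into local
                case 0x04:                                            // add two values
                {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case 0x05:                                            // return top of stack
                    result = stack[--sp];
                    state = ExecState.Return;
                    break;
                default:
                    throw new InvalidOperationException(
                        "Unknown opcode 0x" + ins.Opcode.ToString("x2"));
            }
        }
        return result;
    }
}
```

A devirtualizer walks the same handler table, but instead of executing each handler it emits the CIL that the handler stands for.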

The Crocodile VM consists of 25 Handlers.

Push Instruction

Let’s take the PUSH instruction implemented in the VM as an example.
There are three variants of the PUSH instruction.

0x11 : PUSH Member
0x15 : PUSH Constant
0x12 : PUSH LocalVariable

A value is pushed to the stack with the following method:

0x12 pushes a local variable and is simply devirtualized into ldloc.

Some instructions are devirtualized with pattern matching.

0x11 pushes the member resolved from its metadata token and is always used for the call instruction. Therefore, we don’t add a new instruction for it and make use of the resolved member in the call instruction directly.
0x15 pushes a constant value, but it is always followed by the same instruction (0x6), and the pair translates to ldarg:
0x15 0x00000001
0x6 0x00000000 0x0000003c
The above instruction pattern translates to ldarg.1. The succeeding instruction, i.e. 0x6 with 0x3c as its second operand, is then skipped.

Here pattern matching helps us to get a good decompilation result.
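The 0x15 / 0x6 pattern can be sketched as a small peephole pass. ToyVmIns and ArgPattern are hypothetical stand-ins, the output is plain mnemonic strings rather than AsmResolver instructions, and every instruction that does not match is emitted as a placeholder, so treat this as an illustration of the matching logic only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a decoded VM instruction.
public sealed class ToyVmIns
{
    public byte Opcode { get; }
    public int Operand1 { get; }
    public int Operand2 { get; }

    public ToyVmIns(byte opcode, int operand1 = 0, int operand2 = 0)
    {
        Opcode = opcode;
        Operand1 = operand1;
        Operand2 = operand2;
    }
}

public static class ArgPattern
{
    // Collapses "0x15 <index>" followed by "0x6 <any> 0x3c" into one ldarg.
    public static List<string> Lift(IReadOnlyList<ToyVmIns> block)
    {
        var output = new List<string>();
        for (int i = 0; i < block.Count; i++)
        {
            ToyVmIns cur = block[i];
            bool matches = cur.Opcode == 0x15
                && i + 1 < block.Count
                && block[i + 1].Opcode == 0x06
                && block[i + 1].Operand2 == 0x3c;

            if (matches)
            {
                output.Add("ldarg " + cur.Operand1);
                i++; // skip the 0x6 instruction that completed the pattern
                continue;
            }

            // Placeholder for everything this sketch does not handle.
            output.Add("vm_0x" + cur.Opcode.ToString("x2"));
        }
        return output;
    }
}
```

Feeding it the two instructions from the listing above yields a single "ldarg 1" entry.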

Call Instruction

Let’s have a brief overview of the CALL instruction implemented by the VM.
Here is a good example from the disassembled output.

V_5 = 0x0000302c 
V_6 = 0x0000a68a 
resolve method V_7 => System.Void VirtualGuard.Tests.Maths::.ctor(System.Int32, System.Int32)
push V_6
push V_5
call V_7
pop V_8

It finds a method from its metadata token and stores it in a local variable.

module.TryLookupMember((MetadataToken)operand2, out resolvedMember);

It then makes use of this local variable to call the method. The calling convention here is the opposite of CIL’s: the VM pushes the arguments in right-to-left order (V_6 before V_5 above), whereas CIL expects them left to right. Therefore, we reverse the preceding PUSH instructions to get the order right.

int numParams = resolvedMethod.Parameters.Count;
BlockIns.Reverse(BlockIns.Count - numParams, numParams);
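The reordering relies on List<T>.Reverse(index, count), which reverses only a sub-range in place. Here is a small sketch on plain strings; CallArgs and FixArgumentOrder are hypothetical names, and the sketch assumes every argument was emitted as exactly one instruction, which would not hold if an argument took several instructions to compute.

```csharp
using System;
using System.Collections.Generic;

public static class CallArgs
{
    // Reverses the last numParams entries, i.e. the argument pushes
    // emitted right before the call.
    public static void FixArgumentOrder(List<string> emitted, int numParams)
    {
        if (numParams < 0 || numParams > emitted.Count)
        {
            throw new ArgumentOutOfRangeException(nameof(numParams));
        }
        emitted.Reverse(emitted.Count - numParams, numParams);
    }
}
```

With the Maths constructor example, a list ending in "ldloc V_6", "ldloc V_5" ends in "ldloc V_5", "ldloc V_6" afterwards, and anything emitted before the pushes is left untouched.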

Also, if it’s a constructor being called, then it should be a newobj instruction.

We also need to keep track of the type information for local variables. For this I use a stack as well. If there is a call to a method, we need to note the return type and assign it to the local variable into which the return value is popped.

if (returnType.ElementType != ElementType.Void)
    typeInfoStack.Push(returnType);

Finally, the above call would be translated into the following:

Maths maths = new Maths(0x302C, 0xA68A);

Branch Instruction

The main part of the Replace phase is rewriting the branch instructions, since a branch target is represented by the token of another block rather than, as in CIL, by an offset relative to the start of the instruction following the branch.

V_19 = 0x000947b4 
brfalse V_19
V_20 = 0x000dcd6c 
br V_20

So my idea was to keep devirtualizing along the Br instructions until we encounter a Ret instruction, and only then proceed with the conditional branches. We can opt for either a recursive or a stack-based approach. I chose the latter.

Stack<(int, CilInstructionLabel)> simpleBrStack = new Stack<(int, CilInstructionLabel)>();
Stack<(int, CilInstructionLabel)> condBrStack = new Stack<(int, CilInstructionLabel)>();

while (simpleBrStack.Count > 0 || condBrStack.Count > 0)
{
    if (simpleBrStack.Count > 0)
    {
        // we need to disassemble a simple branch
        var simpleBrData = simpleBrStack.Pop();
        _currentBlockAddr = simpleBrData.Item1;
        _currentBlockLabel = simpleBrData.Item2;
    }
    else if (condBrStack.Count > 0)
    {
        // we need to disassemble a conditional branch
        var condBrData = condBrStack.Pop();
        _currentBlockAddr = condBrData.Item1;
        _currentBlockLabel = condBrData.Item2;
    }
}

We also need to link the Br instructions to their target instructions. This can be done later on with the help of AsmResolver’s CilInstructionLabel.
My devirt handler for the Br instruction is the following:

simpleBrLabel = new CilInstructionLabel();
BlockIns.Add(new CilInstruction(CilOpCodes.Br, simpleBrLabel));
destAddr = localsStore[operand1];
simpleBrStack.Push((destAddr, simpleBrLabel));

After disassembling, we can link the label to the first instruction of the block.

_currentBlockLabel.Instruction = BlockIns[0];
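Putting the two stacks and the label fix-up together, the whole idea can be sketched without AsmResolver. Everything here is a simplification under my own assumptions: ToyLabel stores an index into the output list instead of an instruction, blocks are given as lists of (mnemonic, target token) pairs, and the dictionary of already lowered blocks is an addition of mine so that a block reached twice is not emitted twice.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for a branch label: holds the index of the target instruction.
public sealed class ToyLabel { public int TargetIndex = -1; }

public sealed class ToyOutIns
{
    public string Mnemonic;
    public ToyLabel Target; // null for non-branch instructions
    public ToyOutIns(string mnemonic, ToyLabel target = null)
    {
        Mnemonic = mnemonic;
        Target = target;
    }
}

public static class BranchLinker
{
    public static List<ToyOutIns> Lower(
        Dictionary<int, List<(string Op, int Target)>> blocks, int entryToken)
    {
        var output = new List<ToyOutIns>();
        var firstIndex = new Dictionary<int, int>(); // block token -> first index
        var simpleBr = new Stack<(int, ToyLabel)>();
        var condBr = new Stack<(int, ToyLabel)>();
        simpleBr.Push((entryToken, new ToyLabel()));

        while (simpleBr.Count > 0 || condBr.Count > 0)
        {
            // Simple branches are followed first, conditional ones afterwards.
            var (token, label) = simpleBr.Count > 0 ? simpleBr.Pop() : condBr.Pop();

            // Block already lowered: only link the label.
            if (firstIndex.TryGetValue(token, out int known))
            {
                label.TargetIndex = known;
                continue;
            }

            firstIndex[token] = output.Count;
            label.TargetIndex = output.Count; // link to the block's first instruction

            foreach (var (op, target) in blocks[token])
            {
                if (op == "br" || op == "brfalse")
                {
                    var pending = new ToyLabel();
                    output.Add(new ToyOutIns(op, pending));
                    (op == "br" ? simpleBr : condBr).Push((target, pending));
                }
                else
                {
                    output.Add(new ToyOutIns(op));
                }
            }
        }
        return output;
    }
}
```

The sketch assumes every block contains at least one instruction; an empty block would leave its label pointing at whatever is emitted next.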

Mito also told me that the handler opcodes are randomized for every protected executable. One way to deal with this is to write signatures for the handlers, or to point them out manually, maybe by developing an extension for dnSpyEx. I didn’t bother doing that, as I didn’t want to put in that much effort at this time.

Also, we no longer need the module constructor to initialize the VM, so we can remove the VMInit call.

var instrs = moduleCtr.CilMethodBody.Instructions;
instrs.RemoveRange(0, 2);

The Crack

The devirtualized code could be simplified further with the help of de4dot.blocks to get a better decompilation. I made a simple utility for it. This is the devirtualized main method.

The Validate method looks like the following :

The crack is rather simple: it just checks whether the character ‘d’ occurs twice in our password. FYI, there was a bug in the original crackme, which compares the password characters to “100” instead of “d” (i.e. chr(100)), and to my surprise, my devirt fixed it 😂.

Conclusion

It was my first time developing a devirt for any crackme so it was a fun adventure. Thanks for it, Mito. I’m looking forward to the next VirtualGuard VM. We may also get to read Mito’s blog on VirtualGuard from a developer’s perspective pretty soon 🤞.

You can download the devirtualized versions of the crackmes from the links below:
Devirtualized Spider 🕸 | Devirtualized Crocodile 🐊
I’ve also open sourced the VirtualGuard Devirt. Here you go!
https://github.com/mrT4ntr4/VirtualGuard-Devirt

References

https://github.com/congviet/MemeVMDevirt
https://github.com/Mageland29/CawkVM-Devirter
https://github.com/Washi1337/OldRod
https://github.com/saneki/eazdevirt