A Tale of .NET Deobfuscation - VirtualGuard Devirtualization
This is the second part of the VirtualGuard Protector series, which focuses on its virtualization techniques. In it, I write a devirt for the VMs implemented in VirtualGuard using AsmResolver. If you missed the first part, which covers basic deobfuscation techniques for .NET samples, you can read it here.
Two VMs have been released for VirtualGuard so far: Spider and Crocodile. Spider was released first and is trivial to solve, as it maps almost 1:1 to the CIL instruction set. This post covers the Crocodile VM; Spider is left as an exercise for the reader.
From the first part we already have the deobfuscated binaries of the crackmes, which makes our task much easier. You can also download them from the links below:
Cleaned Spider 🕸 | Cleaned Crocodile 🐊
Long story short, I did some debugging and wrote a toy disassembler for the Crocodile VM. You can check out the disassembled output here. Although that was enough to solve the challenge, Mito motivated me to develop a full devirt for it.
I chose AsmResolver over dnlib to code the devirt this time because:
- I liked its syntax, and it also has pretty good documentation here.
- It is developed by Washi, who is also the author of OldRod, and therefore I’m a fan.
- Heck yeah! Why Not?
I followed the coding conventions of other devirts such as MemeDevirt, which divides a VM devirt into stages and so makes it easier to follow along.
We’ll now discuss each of these stages in detail.
The VMInit method gets the resource “crocodile” from the executable.
This is the VM Bytecode. We can get the resource data using AsmResolver in the following way.
ManifestResource resource = (from x in module.Resources
We now need to identify all the virtualized methods and gather information about them.
I wrote a simple signature to detect them: it checks whether the method contains a call to A4FD01::2CDA08, i.e. the VM dispatcher. This isn’t a robust signature, as the method could have a different name in another executable, but let’s not worry about that for now. We also need other information, such as the token used for disassembling the method (disasConst) and the type signatures of the arguments passed to the dispatcher.
if (methodInstr[instrCount - 3].OpCode == CilOpCodes.Call
The constructor calls the VMInit method with an integer argument.
Every byte of the VM bytecode is first XORed with this constructor argument. The same argument is then used as the XOR operand when calculating the number of VM blocks.
The VMInit function parses the VM Bytecode for block instructions. There are two operands for every instruction. Our code for parsing looks pretty similar.
int numBlocks = _reader.ReadInt32() ^ cctorArg;
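The decoding step described above can be sketched without AsmResolver. This is only an illustration, not the code from the devirt: the class and method names are hypothetical, and I am assuming that the XOR result is truncated to a byte, so only the low 8 bits of the constructor argument affect each byte.

```csharp
using System;
using System.IO;

static class BytecodeDecoder
{
    // XOR every byte of the resource with the constructor argument.
    // Assumption: the result is truncated to a byte, so only the low
    // 8 bits of the key matter here.
    public static byte[] Decode(byte[] raw, int cctorArg)
    {
        var decoded = new byte[raw.Length];
        for (int i = 0; i < raw.Length; i++)
            decoded[i] = unchecked((byte)(raw[i] ^ cctorArg));
        return decoded;
    }

    // The block count is stored XORed with the same key, as in the post.
    public static int ReadBlockCount(BinaryReader reader, int cctorArg)
    {
        return reader.ReadInt32() ^ cctorArg;
    }
}
```

Since XOR is its own inverse, running Decode twice with the same key gives back the original bytes, which is a handy sanity check while debugging.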
Let’s talk about disassembling the VM now. We find a typical VM dispatcher: a while loop that executes VM instructions and depends on a state variable to continue execution.
The execution state is set to CONTINUE. The stack is initialized with a size of 10 elements and the local variable list with a size of 60 variables.
The Crocodile VM consists of 25 Handlers.
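To make the dispatcher shape concrete, here is a toy sketch. It is not the VirtualGuard code: only the opcodes 0x15 and 0x12, the CONTINUE state and the sizes 10 and 60 come from this post. The store (0xF0) and return (0xFF) opcodes, the choice of operand slot and all names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

enum ExecState { Continue, Return }

sealed class ToyDispatcher
{
    private readonly Stack<int> _stack = new Stack<int>(10); // initial capacity from the post
    private readonly int[] _locals = new int[60];            // local variable slots from the post

    // Runs instructions until a handler changes the state or the code ends.
    public int Run(IReadOnlyList<(byte Opcode, int Op1, int Op2)> code)
    {
        var state = ExecState.Continue;
        int ip = 0;
        while (state == ExecState.Continue && ip < code.Count)
        {
            var ins = code[ip++];
            switch (ins.Opcode)
            {
                case 0x15: _stack.Push(ins.Op1); break;           // PUSH Constant (operand slot assumed)
                case 0x12: _stack.Push(_locals[ins.Op1]); break;  // PUSH LocalVariable (operand slot assumed)
                case 0xF0: _locals[ins.Op1] = _stack.Pop(); break; // hypothetical store to local
                case 0xFF: state = ExecState.Return; break;        // hypothetical return
                default:
                    throw new InvalidOperationException("Unknown opcode 0x" + ins.Opcode.ToString("x2"));
            }
        }
        return _stack.Count > 0 ? _stack.Pop() : 0;
    }
}
```

A devirt walks the same instruction stream, but emits CIL for each handler instead of executing it.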
Let’s take the PUSH instruction implemented in the VM as an example.
There are three variants of the PUSH instruction.
- 0x11 : PUSH Member
- 0x15 : PUSH Constant
- 0x12 : PUSH LocalVariable
A value is pushed onto the stack with the following method:
0x12 pushes a local variable and is simply devirted into ldloc.
Some instructions are devirtualized with pattern matching.
- 0x11 pushes the resolved member from its metadata token and is always used for the call instruction. Therefore, we don’t add a new instruction for it and make use of the resolved member in the call instruction directly.
- 0x15 pushes a constant value, but it is always paired with the same follow-up instruction, and the pair translates to ldarg. The instruction pattern below translates to ldarg.1; the succeeding instruction (i.e. 0x6 with 0x3c as its second operand) is then skipped.
0x6 0x00000000 0x0000003c
Here pattern matching helps us to get a good decompilation result.
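The ldarg pattern above could be matched roughly like this. It is a sketch under assumptions: the opcodes 0x15 and 0x6 and the operand 0x3c come from the disassembly, but I am guessing that the argument index sits in the first operand of the 0x15 instruction. The output is plain text rather than real CIL instructions, and every name is hypothetical.

```csharp
using System.Collections.Generic;

static class LdargPattern
{
    const byte PushConst = 0x15;    // PUSH Constant (from the post)
    const byte PairOpcode = 0x06;   // follow-up instruction (from the post)
    const int PairOperand2 = 0x3c;  // its second operand (from the post)

    // Translates the two-instruction pattern into a single "ldarg N" line
    // and passes every other instruction through as a placeholder.
    public static List<string> Translate(IReadOnlyList<(byte Opcode, int Op1, int Op2)> code)
    {
        var output = new List<string>();
        for (int i = 0; i < code.Count; i++)
        {
            var cur = code[i];
            bool isPair = cur.Opcode == PushConst
                && i + 1 < code.Count
                && code[i + 1].Opcode == PairOpcode
                && code[i + 1].Op2 == PairOperand2;

            if (isPair)
            {
                // Assumption: Op1 of the PUSH Constant is the argument index.
                output.Add("ldarg " + cur.Op1);
                i++; // skip the follow-up instruction
            }
            else
            {
                output.Add("vm_0x" + cur.Opcode.ToString("x2"));
            }
        }
        return output;
    }
}
```

Looking one instruction ahead keeps the match local, so a 0x15 that is not followed by the 0x6/0x3c pair falls through to the normal handling.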
Let’s briefly go over the Call instruction implemented by the VM.
Here is a good example from the disassembled output.
V_5 = 0x0000302c
It resolves a method from its metadata token and stores it in a local variable.
module.TryLookupMember((MetadataToken)operand2, out resolvedMember);
It then uses this local variable to call the method. The VM pushes the arguments in the opposite order to the one CIL expects, so we reverse the preceding PUSH instructions to put them right.
int numParams = resolvedMethod.Parameters.Count;
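The reordering itself is a small list operation. Here is a minimal sketch, assuming each argument was produced by exactly one already-emitted instruction (which is not guaranteed in general). The helper name is hypothetical, and it works on any list type, so the example does not depend on AsmResolver.

```csharp
using System;
using System.Collections.Generic;

static class ArgumentOrder
{
    // Reverses the last numParams entries of the emitted list in place.
    // Assumption: one emitted instruction per argument.
    public static void ReverseTrailingArguments<T>(List<T> emitted, int numParams)
    {
        if (numParams < 0 || numParams > emitted.Count)
            throw new ArgumentOutOfRangeException(nameof(numParams));

        emitted.Reverse(emitted.Count - numParams, numParams);
    }
}
```

For an instance method, whether the `this` argument is part of the reversed range depends on how the VM pushes it, so that count would need checking against the disassembly.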
Also, if it’s a constructor being called, the call should be emitted as a newobj instruction.
We also need to keep track of the type information for local variables. For this I use a stack as well. If there is a call to a method, we need to note its return type and assign it to the local variable that the return value is popped into.
if (returnType.ElementType != ElementType.Void)
Finally, the above call would be translated into the following:
Maths maths = new Maths(0x302C, 0xA68A);
The main part of the Replace phase is rewriting the branch instructions, because each branch target is represented by the token of another block rather than by an offset from the start of the instruction that follows the branch, as it is in CIL.
V_19 = 0x000947b4
So my idea was to devirtualize by following the Br instructions until we hit a Ret instruction, and only then proceed with the conditional branches. This can be done with either recursion or an explicit stack. I chose the latter.
Stack<(int, CilInstructionLabel)> simpleBrStack = new Stack<(int, CilInstructionLabel)>();
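The traversal order can be sketched on its own, leaving the labels out. This is a simplified model, not the devirt’s code: each block is reduced to an optional unconditional successor and an optional conditional target, and the names are hypothetical.

```csharp
using System.Collections.Generic;

static class BlockWalker
{
    // blocks maps a block token to its (Br target, conditional target);
    // a block with neither is treated as ending in Ret.
    public static List<int> Order(
        int entry,
        IReadOnlyDictionary<int, (int? Br, int? CondBr)> blocks)
    {
        var order = new List<int>();
        var visited = new HashSet<int>();
        var pending = new Stack<int>();
        pending.Push(entry);

        while (pending.Count > 0)
        {
            int token = pending.Pop();
            if (!visited.Add(token))
                continue; // already devirtualized

            order.Add(token);
            var (br, cond) = blocks[token];

            // Push the conditional target first so that the Br target is
            // popped next (LIFO) and the Br chain is followed to its end.
            if (cond.HasValue) pending.Push(cond.Value);
            if (br.HasValue) pending.Push(br.Value);
        }
        return order;
    }
}
```

The visited set matters as soon as the code contains loops, because a back edge would otherwise push the same block forever.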
We also need to link the Br instructions to their target instructions. This can be done later on with the help of AsmResolver’s CilInstructionLabel.
My devirt handler for the Br instructions is the following:
simpleBrLabel = new CilInstructionLabel();
After disassembling a block, we can bind its label to the first instruction of that block.
_currentBlockLabel.Instruction = BlockIns;
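The idea of creating a label first and binding it later can be shown with stand-in types. Instr and Label below are hypothetical placeholders that only mimic the role CilInstruction and CilInstructionLabel play in the devirt; they are not AsmResolver’s classes.

```csharp
using System.Collections.Generic;

sealed class Label { public Instr Instruction; }                 // stand-in for a branch label
sealed class Instr { public string Text; public Label Target; }  // stand-in for an instruction

sealed class BlockLabels
{
    private readonly Dictionary<int, Label> _labels = new Dictionary<int, Label>();

    // Returns the label for a block token, creating an unbound one on first use.
    public Label For(int blockToken)
    {
        if (!_labels.TryGetValue(blockToken, out var label))
        {
            label = new Label();
            _labels[blockToken] = label;
        }
        return label;
    }

    // Called once the block has been emitted and its first instruction is known.
    public void Bind(int blockToken, Instr first)
    {
        For(blockToken).Instruction = first;
    }
}
```

Because every branch to the same block shares one label object, binding it once fixes up all of those branches, including ones emitted before the target block existed.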
Mito also told me that the handler opcodes are randomized for each protected executable. One way to deal with this is to write signatures for the handlers, or to point them out manually, perhaps by developing an extension for dnSpyEx. I didn’t bother doing that, as I didn’t want to put in that much effort at this time.
Also, we no longer need the module constructor to initialize the VM, so we can remove the VMInit call.
var instrs = moduleCtr.CilMethodBody.Instructions;
The devirtualized code can be simplified further with the help of de4dot.blocks to get a better decompilation. I made a simple utility for it. This is the devirtualized main method.
The Validate method looks like this:
The crackme is rather simple: it just checks whether the character ‘d’ occurs twice in our password. FYI, there was a bug in the original crackme, which compared the password characters to “100” instead of “d” (i.e. chr(100)), and to my surprise, my devirt fixed it 😂.
It was my first time developing a devirt for any crackme so it was a fun adventure. Thanks for it, Mito. I’m looking forward to the next VirtualGuard VM. We may also get to read Mito’s blog on VirtualGuard from a developer’s perspective pretty soon 🤞.
You can download the devirtualized versions of the crackmes from the links below:
Devirtualized Spider 🕸 | Devirtualized Crocodile 🐊
I’ve also open sourced the VirtualGuard Devirt. Here you go!