This time, we solve a Java crackme that revolves around the invokedynamic instruction and has some basic obfuscation. We analyse the Java bytecode instructions and use regex to get past the obfuscation.
Then we experiment with dynamic instrumentation to debug and understand it. We also talk about the tooling often needed to solve Java reversing challenges.
The last two weeks I’ve been fiddling around with reversing java apps and got to try these cool crackmes by @graxcoding.
The third one takes a 64-bit long integer as input and validates it.
Jadx fails to extract any classes, not cool!
I also tried the jar tool’s xf option and other tools like Bytecode Viewer but got the same result ¯\_(ツ)_/¯
Then I remembered a video from MalwareAnalysisForHedgehogs where he used this dumper as a java agent to dump the classes from an obfuscated jar file.
Basically, our agent class must implement a public static premain method, similar in principle to the main method. After the JVM has initialized, the premain method is called first, and then the real application's main method.
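To make the premain idea concrete, here is a minimal sketch of a class-dumping agent. It is not the dumper used above; the class name `DumpAgent`, the `dump` helper and the default output directory are made up for illustration. It registers a transformer that writes every class the JVM loads to disk and leaves the bytes unchanged.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.ProtectionDomain;

// Hypothetical agent for illustration, not the dumper referenced in the post.
class DumpAgent {

    // The JVM calls this before the application's main when started with -javaagent.
    public static void premain(String agentArgs, Instrumentation inst) {
        final Path outDir = Paths.get(agentArgs == null ? "dumped" : agentArgs);
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                if (className != null) { // some generated classes have no name
                    try {
                        dump(outDir, className, classfileBuffer);
                    } catch (UncheckedIOException e) {
                        System.err.println("could not dump " + className + ": " + e);
                    }
                }
                return null; // null tells the JVM to keep the class as it is
            }
        });
    }

    // Writes the raw class bytes to <outDir>/<name with '/' replaced by '_'>.class
    static Path dump(Path outDir, String className, byte[] bytes) {
        try {
            Files.createDirectories(outDir);
            Path target = outDir.resolve(className.replace('/', '_') + ".class");
            return Files.write(target, bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Packed into a jar whose manifest contains `Premain-Class: DumpAgent`, it would be attached with something like `java -javaagent:agent.jar=dumped -jar crackme.jar` (both jar names are placeholders).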
It worked perfectly and we have the me_nov_crackme_CrackMe.class file and can work on it.
At first I tried using JADX again but failed…
Then I used Bytecode Viewer and some of the decompilers it comes with, but sadly those didn’t work either, so I had to go with the bytecode.
I just copied the bytecode from Bytecode Viewer and started analysing it.
It looks obfuscated, but we can easily see what it is doing …
For instance look at the following bytecode:
This can be interpreted as the following pseudocode :
And just after this we have what looks like a try-catch block.
Could easily be converted to the following:
Actually I noticed quite a pattern with the xor instructions and wrote a short python script to just comment the result for it.
The first regex searches for two consecutive ldc instructions on Integers followed by an xor.
The second one matches a single ldc followed by a dup, which always results in 0 because a value xored with itself is 0.
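The original script was in Python and is not reproduced here. As a rough Java sketch of the same idea, the snippet below appends the computed xor result as a comment. The operand format `ldc 123 (java.lang.Integer)` and the class name `XorAnnotator` are assumptions; adjust the patterns to whatever your bytecode listing actually prints.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class XorAnnotator {
    // Assumed listing format for an integer constant: "ldc 123 (java.lang.Integer)"
    private static final String LDC = "ldc (-?\\d+) \\(java\\.lang\\.Integer\\)";
    // ldc A, ldc B, ixor  ->  A ^ B
    private static final Pattern TWO_LDC = Pattern.compile(LDC + "\\s+" + LDC + "\\s+ixor");
    // ldc A, dup, ixor    ->  A ^ A, which is always 0
    private static final Pattern LDC_DUP = Pattern.compile(LDC + "\\s+dup\\s+ixor");

    static String annotate(String bytecode) {
        Matcher m = TWO_LDC.matcher(bytecode);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int value = Integer.parseInt(m.group(1)) ^ Integer.parseInt(m.group(2));
            // Keep the matched instructions and append the folded constant as a comment.
            m.appendReplacement(out, Matcher.quoteReplacement(m.group() + " // = " + value));
        }
        m.appendTail(out);
        // Second pass: the ldc/dup/ixor form needs no arithmetic.
        return LDC_DUP.matcher(out.toString()).replaceAll("$0 // = 0");
    }
}
```

Running `annotate` over the copied listing leaves the instructions untouched and only adds the trailing comments, so the output can still be compared against the original.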
The result of these xors is used as predefined integers in the program.
After a few of these xors we find the Congratulations and sorry strings, which lead us to the checking instruction. It calls one of the methods, i.e. method 0, and passes an array to it as an argument.
Tracing back to the array initialisation I found that there were actually two arrays :
Taking help from the instruction set manual we see that they are both byte arrays.
Subsequently we observe a bastore in every label below: each label does some calculations and stores the result in these byte arrays.
Only the array of length 3 is passed on, and its contents depend on our input.
Then we can see that there are several methods named numerically other than 0 and I explored them the other day.
Indy in Action
Continuing further it is important we understand about invokedynamic.
We can easily observe the invokedynamic instruction at the return of every method other than main.
invokedynamic (or simply indy) defers the choice of a call's target to runtime: the JVM can pick the most appropriate implementation of a method or function after the program has been compiled. This is what makes it useful for optimisation and for writing efficient Java programs.
For example, the JDK itself uses it in the following places:
Lambda Expressions in Java 8+: LambdaMetafactory
String Concatenation in Java 9+: StringConcatFactory
For instance, in newer versions of Java, string concatenation is no longer compiled to a chain of StringBuilder append calls; instead the operands are passed to a single invokedynamic call site that is bootstrapped by StringConcatFactory.
Researching the topic further, I came across this post on the official JEB blog.
And I thought to give JEB Pro a try to see how well it handles it. Apart from some ambiguous variable names all is fine.
So basically, for indy there is a bootstrap method that creates a call site, which points to a handle to a predefined method.
Here 127 looks like the bootstrap method!
Basically, it returns a call site pointing to the method handle it gets by looking up, in the class CrackMe, the static method whose name is the value of fun.
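The lookup described above can be sketched in plain Java. This is not the crackme's code: the class `IndyDemo`, the target methods `m7`/`m9`, their `(int) -> boolean` signature and the use of `ConstantCallSite` are all assumptions made for illustration. Since Java source cannot emit a custom invokedynamic instruction, the sketch calls the bootstrap method by hand.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class IndyDemo {
    // Stand-in for the crackme's static field that names the next method to call.
    static String fun = "m7";

    // Hypothetical targets; the real ones have numeric names and different bodies.
    static boolean m7(int x) { return x > 100; }
    static boolean m9(int x) { return x < 0; }

    // Same shape as a real bootstrap method: the JVM would pass the lookup, the
    // call-site name and its type, and get back a call site bound to a target.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type)
            throws NoSuchMethodException, IllegalAccessException {
        MethodHandle target = lookup.findStatic(IndyDemo.class, fun, type);
        return new ConstantCallSite(target);
    }

    // Simulates one invokedynamic call: pick the target via `fun`, link, invoke.
    static boolean call(String next, int arg) {
        fun = next;
        try {
            CallSite site = bootstrap(MethodHandles.lookup(), "ignored",
                    MethodType.methodType(boolean.class, int.class));
            return (boolean) site.dynamicInvoker().invokeExact(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

The point to notice is that the call target is decided by the content of a string at link time, which is why a decompiler has a hard time showing the real control flow.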
It converts the string, passed as an array, to a 3-digit integer and performs the following operation on it.
So it monitors the execution flow and depends on the callee function. Later, it is converted to a UTF-8 string and stored in the static variable fun.
It passes the integer result xored with a specific value as an argument to the method called as a result of invokedynamic.
All the other methods are mostly similar to this except that they differ in this xor operand value.
For extracting those, we can just use a simple regex. Also for 127 we can just fake it with a 0.
We see that it checks whether the number of functions executed in a run is greater than 127; if so, it returns true, which should then validate our input.
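How the crackme counts the executed functions is not shown here; going by the stack trace length mentioned later, a reasonable guess is that it measures the depth of the call stack. The sketch below only illustrates that mechanism with made-up names (`StackDepthDemo`, `depth`, `nested`): every nested call adds one frame to the trace.

```java
class StackDepthDemo {
    // Number of frames currently on this thread's call stack.
    static int depth() {
        return new Throwable().getStackTrace().length;
    }

    // Calls itself n times before measuring, so the result grows by one per call.
    static int nested(int n) {
        return n == 0 ? depth() : nested(n - 1);
    }
}
```

Comparing `nested(5)` with `nested(0)` gives a difference of exactly 5, which is the kind of value a chain of invokedynamic calls would be checked against.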
Perfect, this is all we need to know to write a short Python script.
Here we check for maximum length of stacktrace which can be achieved and to my surprise it was 22!
Also, there are 3 results (191, 465, 739) with the same stacktrace, which ends at 7039.
Then I moved on and used z3 to find the exact 64-bit integer which results in those 3-digit values. Note the operations on the input.
But indeed it validated successfully !!
So, what did I miss?
I wanted to check what was going on under the hood and tried some debuggers like jdb (didn’t work).
Also I came across Dr Garbage Tool’s Bytecode Visualizer.
This is an old eclipse plugin set and doesn’t work with newer versions of eclipse.
I was able to install it but alas I don’t know why it wasn’t able to identify the main method.
At last, I reached out for some java bytecode editors to add debug print statements.
Recaf is very easy to use and got a nice UI as well so I went with it.
Also I asked Col-E(Recaf’s Developer) about any good bytecode debuggers and unfortunately, it turns out there aren’t any as of now!
I also shared the weirdness of this jar in extracting the class.
That is how I got to know about the forward slash trick, which was pretty obvious from the jar's verbose extraction output, but I hadn’t looked at it carefully before.
Obviously the decompiler view doesn’t work so we’d have to switch to the class table mode.
Select method_0 and edit with assembler.
We can just add a System.out.println() for variable null1 shown.
If everything goes well, we should see this output.
We can do the same for the fun variable.
But we need some automation for adding these instructions in every method.
And currently Recaf doesn’t have any Automation API.
So to resolve this problem I turned to dynamic instrumentation.
Dynamic Instrumentation using ASM
Just FYI I’m new to the instrumentation part so I checked out some frameworks/libraries which could help me with it.
As it turns out, there are several options, and I tried some of them, such as Javassist, Byte Buddy and ASM.
ASM is at the lowest level and is the base for Byte Buddy and cglib as well, so I went with it!
To verify your ASM implementation and see how ASM reads your class, you can check out ASMifier.
Here we have v5 as var_1 and the following condition adds these three lines of code after it encounters any (ISTORE, 1) instruction.
Same goes for logging the static variable ie. fun.
I added a string ie. CrackMe.fun = to differentiate both of them.
Cool, let's see how it ends.
Weirdness of method_63
Ahh as you can see, I missed the most important part of this crackme, ie. UTF-8 LOL !
So lone surrogates are illegal in UTF-8 and are converted to the '?' character (i.e. 63).
FYI Surrogates are characters in the Unicode range U+D800 - U+DFFF.
And here we notice 56575, which is then converted to 63.
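This is easy to reproduce outside the crackme. The snippet below is a small standalone check, with the class and method names (`SurrogateDemo`, `encode`) chosen only for this example: 56575 is 0xDCFF, an unpaired low surrogate, and Java's `String.getBytes` replaces it with the UTF-8 replacement byte `?`.

```java
import java.nio.charset.StandardCharsets;

class SurrogateDemo {
    // Encodes a single UTF-16 code unit to UTF-8 the way String.getBytes does.
    static byte[] encode(int codeUnit) {
        String s = String.valueOf((char) codeUnit);
        // Malformed input such as a lone surrogate is replaced, not rejected.
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] utf8 = encode(56575);
        System.out.println(Character.isSurrogate((char) 56575)); // true
        System.out.println(utf8.length + " byte, value " + utf8[0]); // 1 byte, value 63
    }
}
```

Any code unit in the surrogate range gives the same single byte when it is encoded on its own, so the original value is lost.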
Here it inverts the comparison sign and validates if the length of the stacktrace is less than 127.
So the stacktrace length check for other methods is bogus and is only used for deception.
Also the v5 in method_63 always results in 127 which returns true halting the program.