No doubt about it: for all of us who have been involved in reversing software, the de facto standard tool is the Interactive Disassembler (IDA). It is good not only for reversing but sometimes also for tracking down complex problems. It is like heavy artillery: it gets through everything.
Having said that, this week Lady IDA misled me badly and cost me two precious days. After uncovering the truth I felt that I needed to share...
So it goes like this.
We are given an object file compiled from C code. The format is COFF and it contains ARM code (machine type 0x1c4, to be precise). The object file contains a single function called "fibo", which is recursive (strange, huh?! :)).
When I linked this object file into a DLL and ran it on Windows Phone, I saw that it got stuck in an endless loop at the "bl" instruction (roughly the equivalent of call on x86), because it was jumping to the very address in memory where the instruction itself is located. I went looking for the bad guy in the build chain. I opened the object file in IDA and it showed this.
You can see that at address 0x10 in the text section there is the recursive call (bl _fibo), which is fine, and at first glance the object file seems OK. Even before this week I knew that IDA sometimes takes the branch/jump target from the relocation table of the object file, with no regard to the offset encoded as a literal in the binary instruction. Therefore I took the double instruction word (Thumb-2) that IDA showed for the bl instruction and decoded it using the ARM reference manual. As shown in the screenshot above, the instruction word in the binary is "FFF7 F6FF", which means "bl -20", or jump back 20 bytes. Now, in Thumb code the PC register always reads as the address of the current instruction plus 4, which means that in this case the instruction actually makes the CPU jump back 16 bytes, or 0x10, from its own address.
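For anyone who wants to repeat the manual decoding without the reference manual open, here is a minimal Python sketch of it. It assumes the four bytes are the T1 encoding of BL stored as two little-endian halfwords; the function names decode_thumb2_bl and bl_target are made up for this post and are not part of IDA or any library.

```python
def decode_thumb2_bl(raw: bytes) -> int:
    """Return the signed byte offset encoded in a 4-byte Thumb-2 BL (encoding T1)."""
    if len(raw) != 4:
        raise ValueError("a Thumb-2 BL is exactly 4 bytes long")
    # The instruction is stored as two little-endian 16-bit halfwords.
    hw1 = int.from_bytes(raw[0:2], "little")
    hw2 = int.from_bytes(raw[2:4], "little")
    # hw1 = 11110 S imm10, hw2 = 11 J1 1 J2 imm11
    if (hw1 & 0xF800) != 0xF000 or (hw2 & 0xD000) != 0xD000:
        raise ValueError("not a Thumb-2 BL instruction")
    s = (hw1 >> 10) & 1
    imm10 = hw1 & 0x3FF
    j1 = (hw2 >> 13) & 1
    j2 = (hw2 >> 11) & 1
    imm11 = hw2 & 0x7FF
    # I1 = NOT(J1 XOR S), I2 = NOT(J2 XOR S)
    i1 = 1 - (j1 ^ s)
    i2 = 1 - (j2 ^ s)
    # offset = SignExtend(S : I1 : I2 : imm10 : imm11 : 0), a 25-bit value
    offset = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1)
    if s:
        offset -= 1 << 25
    return offset


def bl_target(insn_addr: int, raw: bytes) -> int:
    """Branch target: in Thumb code the PC reads as instruction address + 4."""
    return insn_addr + 4 + decode_thumb2_bl(raw)


if __name__ == "__main__":
    shown_by_ida = bytes.fromhex("FFF7F6FF")
    print(decode_thumb2_bl(shown_by_ida))      # -20
    print(hex(bl_target(0x10, shown_by_ida)))  # 0x0
```

Fed with the bytes IDA displayed and the instruction address 0x10, it gives the offset -20 and the target 0x0, i.e. the start of the function.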
This is good, since this instruction makes the CPU jump to the beginning of the function, which is exactly what we want and what IDA shows. So everything is fine with the object file, and the problem must be either in the linker or in the loader.
Here I spent two days trying to find out what went wrong in the linker...
This morning I opened the object file in a hex editor for unrelated reasons, and I expected to find the recursive call instruction in the binary: "FFF7 F6FF". The shock came when I didn't find it. I found something else instead.
You can see that the instruction words shown by IDA (also in its hex-view) are different from the instruction words shown in the hex editor.
WHHHHAAAATTTT????
The real values show that the PC-relative call instruction "bl" transfers execution to address PC-4, which, as I wrote before, means an endless loop, since in Thumb code the PC is always 4 bytes ahead of the current instruction. But this is not the point.
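To make the endless loop concrete, here is a small Python sketch that goes the other way and encodes a bl for a given offset. The hex editor screenshot is not reproduced as text here, so the bytes it prints are derived from the Thumb-2 BL (T1) encoding rules and not copied from the file; the function name encode_thumb2_bl is made up for this post.

```python
def encode_thumb2_bl(offset: int) -> bytes:
    """Encode a Thumb-2 BL (encoding T1) with the given signed byte offset."""
    if offset % 2 or not -(1 << 24) <= offset < (1 << 24):
        raise ValueError("offset must be even and fit in 25 signed bits")
    imm = offset & 0x1FFFFFF          # 25-bit two's complement
    s = (imm >> 24) & 1
    i1 = (imm >> 23) & 1
    i2 = (imm >> 22) & 1
    imm10 = (imm >> 12) & 0x3FF
    imm11 = (imm >> 1) & 0x7FF
    # I1 = NOT(J1 XOR S)  =>  J1 = NOT(I1) XOR S, and the same for J2
    j1 = (1 - i1) ^ s
    j2 = (1 - i2) ^ s
    hw1 = 0xF000 | (s << 10) | imm10
    hw2 = 0xD000 | (j1 << 13) | (j2 << 11) | imm11
    # Two little-endian halfwords, in the order a hex editor shows them.
    return hw1.to_bytes(2, "little") + hw2.to_bytes(2, "little")


if __name__ == "__main__":
    insn_addr = 0x10
    offset = -4
    print(encode_thumb2_bl(offset).hex().upper())  # FFF7FEFF
    # Thumb PC = instruction address + 4, so the target is the bl itself.
    print(hex(insn_addr + 4 + offset))             # 0x10
```

With an offset of -4 the target is the address of the bl itself, so the CPU keeps calling the same instruction forever.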
The point is that IDA does not show the original contents of the disassembled file. It seems that it disassembles the code, applies the relocations and then assembles the code back, and that result is what its hex view shows.
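If you want to see the untouched bytes without a hex editor, something like the following Python sketch can list the raw bytes sitting at every relocation site of a COFF object. It is only a sketch under assumptions: it reads the standard 20-byte COFF file header, the 40-byte section headers and the 10-byte relocation records, ignores the symbol table and relocation overflow, and says nothing about how IDA works internally. The function name raw_bytes_at_relocations and the file name in the usage lines are made up.

```python
import struct


def raw_bytes_at_relocations(obj: bytes, width: int = 4):
    """Return (machine, [(section, offset, reloc_type, raw_bytes), ...])."""
    # COFF file header: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics
    machine, nsections, _, _, _, opt_size, _ = struct.unpack_from("<HHIIIHH", obj, 0)
    sites = []
    first_section = 20 + opt_size
    for i in range(nsections):
        # Section header: Name, VirtualSize, VirtualAddress, SizeOfRawData,
        # PointerToRawData, PointerToRelocations, PointerToLinenumbers,
        # NumberOfRelocations, NumberOfLinenumbers, Characteristics
        (name, _, sec_vaddr, _, raw_ptr, rel_ptr, _, nrelocs, _, _) = struct.unpack_from(
            "<8sIIIIIIHHI", obj, first_section + 40 * i)
        sec_name = name.rstrip(b"\0").decode("ascii", "replace")
        for r in range(nrelocs):
            # Relocation record: VirtualAddress, SymbolTableIndex, Type
            vaddr, _, rtype = struct.unpack_from("<IIH", obj, rel_ptr + 10 * r)
            start = raw_ptr + (vaddr - sec_vaddr)
            # These are the bytes as stored in the file, before any relocation.
            sites.append((sec_name, vaddr, rtype, obj[start:start + width]))
    return machine, sites


def dump(path: str) -> None:
    """Print every relocation site of the object file at `path`."""
    with open(path, "rb") as f:
        machine, sites = raw_bytes_at_relocations(f.read())
    print(f"machine type: {machine:#x}")
    for sec, off, rtype, raw in sites:
        print(f"{sec}+{off:#x} type={rtype:#06x} raw={raw.hex().upper()}")
```

Calling dump("fibo.obj") (a hypothetical file name) would print one line per relocation, and the raw bytes on those lines are the ones to compare against what the disassembler displays.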
I guess it is by design, but I hope that Ilfak and co. will comment because it is misleading. Pretty much.
So kids, the moral of today's lesson is: don't trust anyone. Not even this nice lady.
UPDATE!
According to IDA gurus Ilfak and Igor, this is a feature, and it is the behavior that the IDA manual describes (sic). I still feel that this is misleading, because the original code is mixed together with changes made by IDA without any indication of which is which.
So, know your tools!