Sunday, May 4, 2014
Leaving Windows Metro API jail
The Windows Phone 8 SDK lets you use them, but Windows RT devices do not. Microsoft suggests using the .NET socket libraries. I don't want to get into bit politics, but this is quite annoying.
So I asked myself: what can a poor bit-welder do?! I guessed that the winsock DLL is still around somewhere in the system. Why would they throw it out?! This guess turned out to be a good one. The problem was how to load the DLL and how to find the function addresses in memory. In other words, I needed LoadLibrary and GetProcAddress. But of course they are hidden too, otherwise this API jail wouldn't mean much.
To break these restrictions I wrote a mini library. It lets the code get hold of the original LoadLibrary and GetProcAddress, and from there everything works like a charm :)
The library parses the Windows internal structures (which haven't really changed since Win32). It goes from the Thread Environment Block (TEB) to the Process Environment Block (PEB). The PEB contains a list of loaded modules, which of course includes kernel32.dll. From there we only need to parse the export table of kernel32.dll to get to LoadLibrary and GetProcAddress.
Simpler than it sounds :)
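To give a feel for the last step (the export table lookup), here is a rough Python sketch. It is not the code of the mini library: it skips the TEB/PEB walk entirely and works on a byte buffer laid out like a module already mapped in memory, so that RVAs are plain offsets. The names find_export_rva and build_demo_image are made up for this illustration, and the demo image is a tiny synthetic one, not a real kernel32.dll. Forwarded exports are ignored.

```python
import struct

def u16(buf, off):
    return struct.unpack_from("<H", buf, off)[0]

def u32(buf, off):
    return struct.unpack_from("<I", buf, off)[0]

def find_export_rva(image, wanted):
    """Return the RVA of the export named `wanted` (bytes), or None.

    `image` is assumed to be a mapped module, so an RVA is an offset.
    """
    pe = u32(image, 0x3C)                      # e_lfanew in the DOS header
    if image[pe:pe + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = pe + 4 + 20                          # skip signature and file header
    magic = u16(image, opt)
    if magic == 0x10B:                         # PE32
        dirs = opt + 96
    elif magic == 0x20B:                       # PE32+
        dirs = opt + 112
    else:
        raise ValueError("unknown optional header magic")
    exp = u32(image, dirs)                     # data directory 0 = export table
    count = u32(image, exp + 24)               # NumberOfNames
    funcs = u32(image, exp + 28)               # AddressOfFunctions
    names = u32(image, exp + 32)               # AddressOfNames
    ords = u32(image, exp + 36)                # AddressOfNameOrdinals
    for i in range(count):
        start = u32(image, names + 4 * i)
        end = image.index(b"\0", start)
        if image[start:end] == wanted:
            index = u16(image, ords + 2 * i)   # index into the function table
            return u32(image, funcs + 4 * index)
    return None

def build_demo_image():
    """Synthetic 'mapped module' with two exports, for illustration only."""
    img = bytearray(0x200)
    struct.pack_into("<I", img, 0x3C, 0x40)
    img[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<H", img, 0x58, 0x10B)
    struct.pack_into("<II", img, 0x58 + 96, 0x100, 0x80)
    struct.pack_into("<IIII", img, 0x100 + 24, 2, 0x130, 0x140, 0x150)
    struct.pack_into("<II", img, 0x130, 0x1000, 0x2000)   # function RVAs
    struct.pack_into("<II", img, 0x140, 0x160, 0x170)     # name RVAs
    struct.pack_into("<HH", img, 0x150, 1, 0)             # name ordinals
    for off, name in ((0x160, b"GetProcAddress\0"), (0x170, b"LoadLibraryA\0")):
        img[off:off + len(name)] = name
    return bytes(img)

demo = build_demo_image()
print(hex(find_export_rva(demo, b"GetProcAddress")))
print(hex(find_export_rva(demo, b"LoadLibraryA")))
```

In a real process the returned RVA would be added to the module base address taken from the PEB's module list to get a callable pointer.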
Grab the source!
Monday, March 10, 2014
Lady IDA took me on a trip

Having said that, this week Lady IDA misled me badly and cost me two precious days. After uncovering the truth I felt that I needed to share...
So it goes like this.
We are given an object file compiled from C code. The format is COFF and it contains ARM code (machine type 0x1c4 to be precise). The object file contains a single function called "fibo" which is recursive (strange, huh?! :)).
When I linked this object file into a DLL and ran it on Windows Phone, I saw that it got into an endless loop at the "bl" instruction (roughly the equivalent of call on x86), because it was jumping to the very address in memory where the instruction itself is located. I went looking for the bad guy in the build chain. I opened the object file in IDA and it showed this.
You can see that at address 0x10 in the text section there is the recursive call (bl _fibo), which is fine, and at first glance the object file seems OK. Even before this week I knew that IDA sometimes takes the branch/jump target from the object file's relocation table, with no regard to the address encoded as a literal in the binary instruction. Therefore I took the double instruction word (Thumb-2) that IDA showed for the bl instruction and decoded it using the ARM reference manual. As shown in the screenshot above, the instruction word in the binary is "FFF7 F6FF", which means "bl -20", or jump back 20 bytes. Now, we know that on ARM CPUs the PC register is always 4 bytes ahead, which means that in this case the instruction actually causes the CPU to jump back 16 bytes, or 0x10.
This is good, since this instruction makes the CPU jump to the beginning of the function, which is exactly what we want and what IDA shows. Everything is fine with the object file, so the problem must be either in the linker or in the loader.
I then spent two days trying to find out what went wrong in the linker...
This morning I opened the object file in a hex editor for unrelated reasons, and I was expecting to find the recursive call instruction in the binary: "FFF7 F6FF". The shock came when I didn't find it. I found something else instead.
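The manual decoding above can be reproduced with a few lines of Python. This is a minimal sketch of the Thumb-2 BL (encoding T1) immediate decoding as I read it from the ARM reference manual; the function name decode_thumb2_bl is made up for this illustration, and the bytes are taken in file order, i.e. as two little-endian halfwords.

```python
import struct
from typing import Tuple

def decode_thumb2_bl(raw: bytes, address: int) -> Tuple[int, int]:
    """Decode a 4-byte Thumb-2 BL and return (offset, target address)."""
    hw1, hw2 = struct.unpack("<HH", raw)
    # First halfword: 11110 S imm10, second halfword: 11 J1 1 J2 imm11
    if (hw1 & 0xF800) != 0xF000 or (hw2 & 0xD000) != 0xD000:
        raise ValueError("not a Thumb-2 BL instruction")
    s = (hw1 >> 10) & 1
    imm10 = hw1 & 0x3FF
    j1 = (hw2 >> 13) & 1
    j2 = (hw2 >> 11) & 1
    imm11 = hw2 & 0x7FF
    i1 = 1 - (j1 ^ s)          # I1 = NOT(J1 XOR S)
    i2 = 1 - (j2 ^ s)          # I2 = NOT(J2 XOR S)
    offset = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1)
    if s:
        offset -= 1 << 25      # sign-extend the 25-bit value
    # In Thumb state the PC reads as the instruction address plus 4.
    return offset, address + 4 + offset

offset, target = decode_thumb2_bl(bytes.fromhex("FFF7F6FF"), 0x10)
print(offset, hex(target))
```

For the instruction word shown by IDA at address 0x10 this gives an offset of -20 and a target of 0x0, the start of the function.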
You can see that the instruction words shown by IDA (also in its hex-view) are different from the instruction words shown in the hex editor.
WHHHHAAAATTTT????
The real values show that the PC-relative call instruction "bl" transfers execution to address PC-4, which, as I wrote before, means an endless loop, since the PC is always 4 bytes ahead on ARM. But this is not the point.
The point is that IDA does not show the original contents of the disassembled file. It seems that it disassembles the code, applies the relocations and then assembles the code back, and that result is what appears in its hex view.
I guess it is by design, but I hope that Ilfak and co. will comment, because it is pretty misleading.
So kids, the moral lesson for today is: don't believe anyone. Not even this nice lady.
UPDATE!
According to IDA gurus Ilfak and Igor this is a feature, and it is the behavior which the IDA manual describes (sic). I feel that this is still misleading, because the original code is mixed together with changes made by IDA, with nothing to tell them apart.
So, know your tools!
Monday, September 16, 2013
Don't wait for the signal
The back trace was nearly equivalent to what can be seen here.
backtrace:
#00 pc 00626916 /data/app-lib/com.pmon.app1-1/libvgc.so
#01 pc 00626bb3 /data/app-lib/com.pmon.app1-1/libvgc.so
#02 pc 0062792d /data/app-lib/com.pmon.app1-1/libvgc.so (curl_mvsnprintf+20)
#03 pc 00621c01 /data/app-lib/com.pmon.app1-1/libvgc.so (Curl_failf+24)
#04 pc 0061c0cd /data/app-lib/com.pmon.app1-1/libvgc.so (Curl_resolv_timeout+188)
So the crashed thread id was the same as the process id, meaning the main thread crashed. Now the interesting part was that libcurl is not used from the application's main thread. For sure...
How come the crashed thread (again, the main one) was running libcurl code at the time of the crash?
Surely you think: well, the stack was corrupted. Or there was something wrong with the crash report. No no, everything was fine. The stack frames seemed to be in order, everything was OK except that last pointer, which contained an address in no man's land.
The question was what could make a thread run code that it shouldn't. It took me a while to remember that Linux signals can do this trick. I mean, if libcurl registers its own code for signal handling (and this theory was later confirmed), then yes, some of the signal handlers will indeed run on the main thread of the process!
See this for more info on the subject.
Just to complete the story, the reason for the crash was that a canceled request caused a timeout. Some of the related data containers had already been released by the time of the timeout, so the signal handler went to play in the minefield. What a pity... :-/
To disable the use of signals in curl you can do two things:
- Set the "CURLOPT_NOSIGNAL" option when initializing curl. Keep in mind that with the standard resolver this also disables timeouts during name resolution (less than ideal for most applications).
- Compile curl with the threaded resolver. It requires pthread support in the toolchain and on the target system.
Thursday, September 12, 2013
Don't you dare to deny my morning COFFee, PEople
Side-note: understanding PE/COFF after knowing ELF and Mach-O is a piece of cake. Most of the differences are merely syntactic...
So the product is a Python parser for PE/COFF files. It works just like readelf:
c:\Users\pmon\Desktop\petools>readcoff.py -h winp8.o
COFF Header:
Machine: 0x01c4
Number of sections: 0x0006
Time of creation: 2013-09-08 12:44:38
Pointer to symbol table: 0x00003eff
Number of symbols: 0x0000007b
Size of optional headers: 0x0000
Characteristics: 0x0000
The thing is not complete but you can use it to investigate object files.
It can be taken from:
https://bitbucket.org/pmon/petools
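For a sense of what the header dump above involves, here is a minimal sketch of reading the 20-byte COFF file header with Python's struct module. It is not the readcoff.py code; the names CoffHeader and parse_coff_header are made up for this illustration. The sample bytes are packed by hand from the values printed above, except the timestamp, which is left as a placeholder 0 because the raw field is in seconds since the Unix epoch and the time zone of the printed date is not known.

```python
import struct
from collections import namedtuple

# Field order follows the COFF file header layout (all little-endian).
CoffHeader = namedtuple(
    "CoffHeader",
    "machine sections timestamp symtab_offset symbol_count "
    "opt_header_size characteristics",
)
COFF_FORMAT = "<HHIIIHH"   # 2 + 2 + 4 + 4 + 4 + 2 + 2 = 20 bytes

def parse_coff_header(data: bytes) -> CoffHeader:
    """Parse the COFF file header found at the start of an object file."""
    if len(data) < struct.calcsize(COFF_FORMAT):
        raise ValueError("file too short for a COFF header")
    return CoffHeader(*struct.unpack_from(COFF_FORMAT, data))

# Hand-built sample header; the timestamp is a placeholder, not the real one.
sample = struct.pack(COFF_FORMAT, 0x01C4, 6, 0, 0x3EFF, 0x7B, 0, 0)
hdr = parse_coff_header(sample)
print("Machine: 0x%04x" % hdr.machine)
print("Number of sections: 0x%04x" % hdr.sections)
print("Pointer to symbol table: 0x%08x" % hdr.symtab_offset)
print("Number of symbols: 0x%08x" % hdr.symbol_count)
```

For a real object file, pass the first 20 bytes read from the file in binary mode instead of the sample.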