Monday, May 22, 2017

Creating images for legacy PC VGA display


Recently I came across the problem of creating bitmap images for the VGA color display: the famous mode 13h, set through int 10h. Don't ask me why I needed it; I used DOSBox for emulation. The question was how to create images for this display in such a way that the pixels can be copied directly from the file into display memory. This means three things:

  1. No image compression
  2. The image needs to fit the 256-color VGA palette
  3. Maybe resizing

Since I couldn't find any resources on this, I took up the "challenge" and wrote down how I did it. Though it is not rocket science, it might save you some precious time.

You are going to need GIMP. I'm sure it can be done with PS, but GIMP is open source ;-)

Step 1

Take any image you want. Of course, you need to keep in mind the limitations of VGA mode. Most likely you will need to downscale the image and most of the colors will be lost.



Step 2

Install the VGA palette in GIMP. It is not included in GIMP by default. I took the palette definition file from here. Thanks to icebreaker (Szabolcs Mihali).

You need to download the "VGA.gpl" file from the link above and import it into GIMP. I just copied it to the palettes folder of my GIMP installation. On my Windows installation it was "c:\Program Files\GIMP 2\share\gimp\2.0\palettes". But that is just me; I tend to take shortcuts when it comes to GUIs.



The proper way to import the palette is to open the Palettes dialog by going to Windows -> Dockable Dialogs -> Palettes. When the dialog is open, right-click in the list of palettes and select Import from the context menu. Then navigate to the gpl file and select it. Voila!

Step 3


Convert the image to VGA mode. Select Image -> Mode -> Indexed... in the menu. It will open the Indexed Color Conversion dialog in GIMP. You have to select the custom palette option from the colormap radio buttons and pick the VGA palette imported in the previous step.


It is important not to use the "Remove unused colors from colormap" feature. It would defeat the original goal of being able to copy the contents directly to VGA memory, because it remaps the color values.

Hit Convert. After this your image might look worse because of the reduced number of colors.


Step 4


Here we might need to resize the image. Open Image -> Scale Image...

I have chosen to fit the image to the 320x200 pixels of the VGA display but you can resize the image as you please. 


You can hit "Scale" and your image will probably look smaller :)


Step 5


Exporting to BMP format. Uncompressed BMP is a very simple image format where the pixels are kept "as is" in the file. We export the image to this format by clicking File -> Export As...


Here you have to give the file a ".bmp" extension, like "ibm_at.bmp". After that, hit "Export" and you will get the following dialog. I select "Do not write color space information" because it is not needed for this application and it makes the file smaller.




Now you have a BMP image whose pixel data can be copied directly into VGA memory. It looks like this in DOSBox:


The following post helped me implement the parsing of the BMP file.

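Caveat: in some cases GIMP appears to write the last pixel of a line in four bytes instead of one. This is most likely the row padding of the BMP format, which pads every row to a multiple of four bytes, and it can cause problems in parsers that do not skip the padding. Resizing the image to another size (for example a width divisible by four, such as 320) made the problem disappear for me.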
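
The parsing step can be sketched in a few lines of Python. This is only a minimal sketch, assuming an uncompressed 8-bit indexed BMP with the usual 40-byte (or larger) info header, as produced by the export above; the function name bmp_to_vga_buffer is made up for illustration. It skips the per-row padding and flips the rows, because BMP normally stores them bottom-up, so the result is a linear top-down buffer of palette indices.

```python
import struct


def bmp_to_vga_buffer(data: bytes) -> bytes:
    """Return the palette indices of an 8-bit BMP as linear top-down rows."""
    if data[:2] != b"BM":
        raise ValueError("not a BMP file")
    # File header: offset of the pixel array is stored at byte 10.
    pixel_offset = struct.unpack_from("<I", data, 10)[0]
    # Info header starts at byte 14: size, width, height, planes, bpp, compression.
    header_size = struct.unpack_from("<I", data, 14)[0]
    if header_size < 40:
        raise ValueError("unsupported (old-style) BMP header")
    width, height = struct.unpack_from("<ii", data, 18)
    bpp = struct.unpack_from("<H", data, 28)[0]
    compression = struct.unpack_from("<I", data, 30)[0]
    if bpp != 8 or compression != 0:
        raise ValueError("expected an uncompressed 8-bit indexed BMP")

    # Every row is padded to a multiple of four bytes.
    stride = (width + 3) & ~3
    rows = []
    for y in range(abs(height)):
        start = pixel_offset + y * stride
        rows.append(data[start:start + width])  # drop the padding bytes
    # A positive height means the rows are stored bottom-up.
    if height > 0:
        rows.reverse()
    return b"".join(rows)
```

For a 320x200 image the returned buffer is exactly 64000 bytes, which is the size of the mode 13h frame buffer at segment A000h.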




Sunday, May 4, 2014

Leaving Windows Metro API jail

I'm still playing in the WoA (Windows on ARM) realm these days. I was porting a nice test framework library which enables remote execution of the code under test. Then it turned out that the native Windows socket API is hidden from Metro applications...

The SDK of Windows Phone 8 lets you use them, but Windows RT devices do not. Microsoft suggests using the .NET socket libraries. I don't want to get into bit politics, but this is quite annoying.

So I asked myself: what can a poor bit-welder do?! I guessed that the Winsock DLL is still around in the system somewhere. Why would they throw it out?! This guess turned out to be a good one. The problem was how to load the DLL and how to find the function addresses in memory. In other words, I needed LoadLibrary and GetProcAddress. But of course they are also hidden, otherwise this API jail wouldn't mean much.

To break out of these restrictions I have written a mini library. It enables the code to get the original LoadLibrary and GetProcAddress, and from there everything works like a charm :)

The library parses the Windows internal structures (which haven't really changed since Win32). It gets from the Thread Environment Block (TEB) to the Process Environment Block (PEB). The PEB contains a list of loaded modules, which of course includes kernel32.dll. From there we only need to parse the export table of kernel32.dll to get to LoadLibrary and GetProcAddress.
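
The last step, looking up a name in the export table, can be illustrated in Python. This is not the library's code, only a sketch of the PE export directory layout operating on the bytes of an already mapped image, where RVAs can be used directly as offsets. The function name find_export is made up, the TEB/PEB walk is not shown, and forwarded exports are not handled.

```python
import struct


def _u16(image: bytes, offset: int) -> int:
    return struct.unpack_from("<H", image, offset)[0]


def _u32(image: bytes, offset: int) -> int:
    return struct.unpack_from("<I", image, offset)[0]


def find_export(image: bytes, name: str) -> int:
    """Return the RVA of the exported function `name` in a mapped PE image."""
    if image[:2] != b"MZ":
        raise ValueError("missing DOS header")
    pe_offset = _u32(image, 0x3C)                  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    optional = pe_offset + 4 + 20                  # skip signature + COFF header
    magic = _u16(image, optional)
    # The data directories start at offset 96 (PE32) or 112 (PE32+).
    directories = optional + (96 if magic == 0x10B else 112)
    export_dir = _u32(image, directories)          # entry 0 = export table RVA

    number_of_names = _u32(image, export_dir + 24)
    functions = _u32(image, export_dir + 28)       # AddressOfFunctions
    names = _u32(image, export_dir + 32)           # AddressOfNames
    ordinals = _u32(image, export_dir + 36)        # AddressOfNameOrdinals

    wanted = name.encode("ascii")
    for i in range(number_of_names):
        name_rva = _u32(image, names + 4 * i)
        end = image.index(b"\0", name_rva)
        if image[name_rva:end] == wanted:
            index = _u16(image, ordinals + 2 * i)  # index into the function table
            return _u32(image, functions + 4 * index)
    raise KeyError(name)
```

In native code the returned RVA is added to the module base address found in the PEB module list to obtain a callable pointer.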

Simpler than it sounds :)

Grab the source!



Monday, March 10, 2014

Lady IDA took me to a trip

No doubt. For all of us who have been involved in reversing software, the de facto standard tool is the Interactive Disassembler (IDA). It is not just good for reversing but sometimes also good for tracking down complex problems. It is like heavy artillery. It gets through everything.


Having said that, this week Lady IDA misled me heavily and cost me two precious days. After uncovering the truth I felt that I needed to share...

So it goes like this.

We are given an object file created from C code. The format is COFF and it contains ARM code (machine type 0x1c4 to be precise). The object file contains a single function called "fibo", which is recursive (strange, huh?! :)).

When I linked this object file into a DLL and ran it on Windows Phone, I saw that it got into an endless loop at the "bl" instruction (roughly the equivalent of call on x86) because it was jumping to the very address in memory where the instruction itself is located. I went looking for the bad guy in the build chain. I opened the object file in IDA and it showed this.

fibo_ida_1


You can see that at address 0x10 in the text section there is the recursive call (bl _fibo), which is fine, and at first glance the object file seems OK. Even before this week I knew that IDA sometimes takes the branch/jump target from the relocation table of the object file, with no regard to the offset encoded as a literal in the binary instruction. Therefore I took the double instruction word (Thumb-2) that IDA showed for the bl instruction and decoded it using the ARM reference manual. As shown in the screenshot above, the instruction word in the binary is "FFF7 F6FF", which means "bl -20", or jump back 20 bytes. In Thumb state the PC always reads 4 bytes ahead of the current instruction, so in this case the instruction actually makes the CPU jump back 16 bytes, or 0x10.
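
The manual decoding can be double-checked with a short Python sketch. It follows the Thumb-2 BL (immediate) encoding: two little-endian halfwords, with the offset assembled from the S, J1, J2, imm10 and imm11 fields. The function name decode_thumb2_bl is made up for illustration; the bytes and the address 0x10 are the ones quoted above.

```python
def decode_thumb2_bl(raw: bytes, address: int):
    """Decode a 4-byte Thumb-2 BL instruction located at `address`.

    Returns (offset, target), where target = address + 4 + offset,
    because in Thumb state the PC reads 4 bytes ahead.
    """
    hw1 = int.from_bytes(raw[0:2], "little")
    hw2 = int.from_bytes(raw[2:4], "little")
    # First halfword: 11110 S imm10; second halfword: 11 J1 1 J2 imm11.
    if (hw1 & 0xF800) != 0xF000 or (hw2 & 0xD000) != 0xD000:
        raise ValueError("not a Thumb-2 BL instruction")

    s = (hw1 >> 10) & 1
    imm10 = hw1 & 0x3FF
    j1 = (hw2 >> 13) & 1
    j2 = (hw2 >> 11) & 1
    imm11 = hw2 & 0x7FF

    # I1 = NOT(J1 XOR S), I2 = NOT(J2 XOR S)
    i1 = 1 - (j1 ^ s)
    i2 = 1 - (j2 ^ s)

    # 25-bit signed value: S:I1:I2:imm10:imm11:0
    offset = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1)
    if s:
        offset -= 1 << 25
    return offset, address + 4 + offset


offset, target = decode_thumb2_bl(bytes.fromhex("FFF7F6FF"), 0x10)
print(offset, hex(target))
```

For the bytes shown by IDA this yields an offset of -20 and a target of 0x0, the start of the function.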


This is good since this instruction causes the CPU to jump to the beginning of the function which is exactly what we want and what IDA shows. Everything is fine with the object file so the problem must be either in the linker or the loader.


I then spent two days trying to find out what went wrong in the linker...


This morning I opened the object file in a hex editor for other reasons, and I was expecting to find the recursive call instruction in the binary: "FFF7 F6FF". The shock came when I didn't find it. I found something else instead.


fibo_ida_2


You can see that the instruction words shown by IDA (also in its hex-view) are different from the instruction words shown in the hex editor.


WHHHHAAAATTTT????


ida_meme


The real values show that the PC-relative call instruction "bl" transfers execution to PC-4, which, as I wrote before, means an endless loop, since in Thumb state the PC is always 4 bytes ahead of the current instruction. But this is not the point.


The point is that IDA does not show the original contents of the disassembled file. It seems that it disassembles the code, applies the relocations and then assembles the code back, and that result is what is shown in its hex view.


I guess it is by design, but I hope that Ilfak and co. will comment because it is misleading. Pretty much.


So kids, the moral lesson for today is: don't trust anyone. Not even this nice lady.



UPDATE!


According to IDA gurus Ilfak and Igor this is a feature and this is the behavior which the IDA manual presents (sic). I feel that this is still misleading because the original code is mixed together with changes made by IDA with no reference.

So, know your tools!

Monday, September 16, 2013

Don't wait for the signal

I have been looking into a crazy issue in libcurl. An Android application was crashing, and according to the crash report the reason was a bad pointer. That's not interesting at all, and I wouldn't post about it if the circumstances hadn't made me ponder the depth of the hole we sometimes get into.

The back trace was nearly equivalent to what can be seen here.
backtrace:
#00 pc 00626916 /data/app-lib/com.pmon.app1-1/libvgc.so
#01 pc 00626bb3 /data/app-lib/com.pmon.app1-1/libvgc.so
#02 pc 0062792d /data/app-lib/com.pmon.app1-1/libvgc.so (curl_mvsnprintf+20)
#03 pc 00621c01 /data/app-lib/com.pmon.app1-1/libvgc.so (Curl_failf+24)
#04 pc 0061c0cd /data/app-lib/com.pmon.app1-1/libvgc.so (Curl_resolv_timeout+188)

So the crashed thread id was the same as the process id, meaning the main thread crashed. Now the interesting part was that libcurl is not used from the application's main thread. For sure...

How come the crashed thread (again, the main one) was running libcurl code at the time of the crash?

Sure you think, well, the stack was corrupted. Or there was something wrong with the crash report. No no, everything was fine. The stack frames seemed to be in order, everything was OK except that last pointer which contained an address to no man's land.

The question was what could cause a thread to run code that it shouldn't. It took me a while to remember that Linux signals can do this trick. I mean, if libcurl registers its code for signal handling (and this theory was later proven), then yes, some of the signal handlers will indeed run on the main thread of the process!

See this for more info on the subject.

Just to complete the story, the reason for the crash was that a canceled request caused a timeout. Some of the related data containers had already been released by the time of the timeout, so the signal handler went to play on the minefield. What a pity... :-/

To disable the use of signals in curl you can do two things:

  • Use the "CURLOPT_NOSIGNAL" option when initializing curl. Keep in mind that with the standard resolver this also disables timeouts during name resolution (less ideal for most applications)

  • Compile curl with the threaded resolver. It requires pthread support in the toolchain and the target system.

Thursday, September 12, 2013

Don't you dare to deny my morning COFFee, PEople

This week I have been involved in a courageous project related to Microsoft's linker for WP8 (part of VS2012). It required some knowledge of the PE/COFF formats, which I didn't have before, so I decided to learn these formats by writing a tool for reading such files, the way we have "readelf" for ELF and "otool" for Mach-O.

Side-note: understanding PE/COFF after knowing ELF and Mach-O is a piece of cake. Most of the differences are merely syntactic...

So the product is a Python parser of PE/COFF files. It works just like readelf:
c:\Users\pmon\Desktop\petools>readcoff.py -h winp8.o
COFF Header:
Machine: 0x01c4
Number of sections: 0x0006
Time of creation: 2013-09-08 12:44:38
Pointer to symbol table: 0x00003eff
Number of symbols: 0x0000007b
Size of optional headers: 0x0000
Characteristics: 0x0000

The tool is not complete, but you can use it to investigate object files.

It can be taken from:

https://bitbucket.org/pmon/petools
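
For a feel of how little is needed to read the header fields listed above, here is a minimal Python sketch. It is not the code of readcoff.py; the names read_coff_header and format_coff_header are made up, and it assumes a plain COFF object file, where the 20-byte file header sits at offset 0 (a linked PE image has a DOS stub and a PE signature in front of it).

```python
import struct
from datetime import datetime, timezone

# Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
# NumberOfSymbols, SizeOfOptionalHeader, Characteristics
COFF_HEADER = struct.Struct("<HHIIIHH")


def read_coff_header(data: bytes) -> dict:
    """Parse the 20-byte COFF file header at the start of an object file."""
    if len(data) < COFF_HEADER.size:
        raise ValueError("file too short for a COFF header")
    (machine, sections, stamp, symtab,
     symbols, opt_size, flags) = COFF_HEADER.unpack_from(data, 0)
    return {
        "machine": machine,
        "sections": sections,
        # The time stamp is stored as seconds since the Unix epoch.
        "created": datetime.fromtimestamp(stamp, tz=timezone.utc),
        "symbol_table": symtab,
        "symbols": symbols,
        "optional_header_size": opt_size,
        "characteristics": flags,
    }


def format_coff_header(header: dict) -> str:
    """Render the parsed header as readelf-style text lines."""
    return "\n".join([
        "COFF Header:",
        f"Machine: 0x{header['machine']:04x}",
        f"Number of sections: 0x{header['sections']:04x}",
        f"Time of creation: {header['created']:%Y-%m-%d %H:%M:%S}",
        f"Pointer to symbol table: 0x{header['symbol_table']:08x}",
        f"Number of symbols: 0x{header['symbols']:08x}",
        f"Size of optional headers: 0x{header['optional_header_size']:04x}",
        f"Characteristics: 0x{header['characteristics']:04x}",
    ])
```

Calling format_coff_header(read_coff_header(open("winp8.o", "rb").read())) prints the fields for the object file used in the example; the time is shown in UTC here, which may differ from the tool's output.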