RISC OS: Introduction to the ARM AIF object file format


In this post we’ll see some details about the ARM Image File Format useful to new RISC OS software developers when coding with Assembly and/or compiled languages.

Disclaimer

This article is by no means an exhaustive treatment of the topic. It is intended as a simple introduction, with the minimal amount of information required to gain a general understanding of the AIF format and some ability to debug RISC OS applications stored in AIF files.

At the end of the article you can find a reference list with documents containing more details. I did my best to summarise the most important information and to test and verify all of it to the best of my abilities and available time.

Intro

The ARM Image Format (AIF) is a simple object file format used primarily for software intended to run on ARM microprocessors. It was introduced by Acorn Computers Ltd during the early days of ARM for use on the Archimedes, the RiscPC and all the other RISC OS computer ranges. It is still used on RISC OS based computers nowadays, although RISC OS now also supports the ELF object file format via the UnixLib suite. AIF supports debug info, so it can optionally help with debugging programs on RISC OS.

Although it is possible to create an AIF manually (the format is extremely simple), an AIF file is generally generated by a linker: we tell the linker to take an AOF (ARM Object Format) file or a binary image as input and to generate the AIF file as output.

AIF structure

An AIF file consists of a few parts:

  • A 128-byte header area
  • Binary Image (our code)
  • Image’s initialised static data

An AIF can also be compressed and self-decompressing (this improves loading performance from slow devices). In this case its parts are:

  • A 128-byte header
  • Compressed Image
  • Decompression data  (this data section appears to be position independent)
  • Decompression code (this code appears to be position independent)

An AIF file layout is composed of:

  • Header
  • A read-only area
  • A read-write area
  • Debugging data (this is optional and populated when asking compilers, assemblers and linkers to add such info to the output file)
  • Self-relocation code (position independent)
  • Relocation list (a list of words to relocate terminated by a -1)

Characteristics of the AIF files

The ARM SDT Reference Guide reports that there are 3 types of AIF files:

Executable AIF file

  • This type of AIF is the most common (it's basically used for most !RunImage files of RISC OS applications). It can be loaded at its load address and entered or executed from there.
  • When executed it can relocate itself if required.
  • It can create its own zero-initialized area (using the ZeroInit code subroutine, explained later here).
  • The image header contains code that ensures that the image is setup correctly for execution before being executed at its entry-point.
  • The 4th word of an executable AIF header is always: BL entry-point-address
  • The base address of this AIF is where the AIF header is loaded, while the code address is at base_address + 0x80
  • On RISC OS the base address for our Executable AIF header is always 0x8000 unless it relocates itself.

For Beginners: If you are wondering how it is possible that all applications load at the same address without overwriting each other, this is because RISC OS has been designed from the beginning around the concept of using an MMU (Memory Management Unit) and a virtual memory address space. In other words, that 0x8000 is actually a virtual address which the MMU maps to different physical addresses in memory, using memory allocation tables created by RISC OS itself at startup. If I have time I’ll add an article about the details of how this works on ARM and RISC OS.

Non-Executable AIF file

  • This type of AIF needs to be prepared for execution by an image loader.
  • When the image loader has prepared this type of AIF by following the header, the header will be discarded
  • The base address of this type of AIF is the address where it should be loaded

Extended AIF file

  • This type of AIF is a special type of Non-Executable AIF.
  • This type of AIF contains a scatter-loaded image.
  • It has a header that points to a chain of descriptors within the file.

The AIF header

The AIF header, which may also be displayed in some debuggers on RISC OS, is structured as a sequence of 32-bit words. Having some knowledge of it will help you understand what’s going on when you start a debugging session.

The AIF header is generally composed of:

Offset  Word                                    Brief description
0x00    BL DecompressCode | NOP                 Jump to the decompression code, or NOP if the AIF is not compressed
0x04    BL SelfRelocCode | NOP                  Jump to the self-relocation subroutine, or NOP if the image is not self-relocating
0x08    BL ZeroInit | NOP                       Jump to the ZeroInit code subroutine, or NOP if the image has none
0x0C    BL ImageEntryPoint | EntryPoint offset  Jump to the EntryPoint for an Executable AIF, or the EntryPoint offset for a Non-Executable AIF. BL is used so that the header can be addressed via R14 (the ARM32 Link Register) in a position-independent manner
0x10    Program Exit                            Instruction that exits the program as a last resort; on RISC OS this is a SWI OS_Exit
0x14    Image ReadOnly size                     Size of the ReadOnly section; it includes the size of the header only if the AIF is Executable
0x18    Image ReadWrite size                    Exact size of the ReadWrite section, a multiple of 4 bytes
0x1C    Image Debug size                        Exact size of the Debug section, a multiple of 4 bytes. Includes high-level and low-level debug size. Bits 0-3 hold the type, bits 4-31 hold the low-level debug size
0x20    Image ZeroInit area size                Exact size of the ZeroInit section, a multiple of 4 bytes
0x24    Image Debug type                        Valid values: 0 = no debugging data present, 1 = low-level debugging data present, 2 = source-level debugging data present, 3 = both 1 and 2 present
0x28    Image base                              Address where the code was linked
0x2C    Work Space                              Obsoleted in the ’90s
0x30    Address mode                            Contains 0, 26 or 32 in its least significant byte: 26 or 32 indicates that the image was linked for 26-bit or 32-bit mode, 0 indicates an old-style 26-bit AIF header
0x34    Data base                               Address where the image data was linked
0x38    Two reserved words                      This is for Extended AIF
0x40    DBGInit | NOP                           Debug Initialisation Instruction, or NOP if DBGInit is unused
0x44    ZeroInit code (15 words)                Header is 32 words long

AIF Header details Table
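As a small illustration of how two of the header words in the table might be read from C, here is a minimal sketch. It assumes the 32 header words are already in memory as an array of uint32_t in host byte order. The function names are hypothetical (they do not come from any RISC OS library), and treating the debug type as two bit flags is my reading of "3 = both 1 and 2 present".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Word indexes derived from the byte offsets in the header table. */
#define AIF_WORD_DEBUG_TYPE   (0x24u / 4u)
#define AIF_WORD_ADDRESS_MODE (0x30u / 4u)

/* Returns 0 (old-style 26-bit header), 26 or 32:
   the least significant byte of the word at offset 0x30. */
unsigned aif_address_mode(const uint32_t *header)
{
    assert(header != NULL);
    return (unsigned)(header[AIF_WORD_ADDRESS_MODE] & 0xFFu);
}

/* Debug type 1 or 3: low-level debugging data is present. */
bool aif_has_low_level_debug(const uint32_t *header)
{
    assert(header != NULL);
    return (header[AIF_WORD_DEBUG_TYPE] & 1u) != 0;
}

/* Debug type 2 or 3: source-level debugging data is present. */
bool aif_has_source_level_debug(const uint32_t *header)
{
    assert(header != NULL);
    return (header[AIF_WORD_DEBUG_TYPE] & 2u) != 0;
}
```

For the HelloWorld example discussed later, where the address mode word is 0, aif_address_mode would return 0.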

The Binary Image

The binary image is fundamentally our code and, given that most of the compilers available for RISC OS only support static linking, it also contains all the libraries you may have used during the static linking phase.

The Image EntryPoint may also depend on the runtime library used with our code, for example if we used a compiler that links against the SharedCLibrary provided with RISC OS then the AIF header EntryPoint will be the initialisation of the SharedCLibrary that, when done, will call our code’s main function.

The Image EntryPoint for a simple Assembly binary code (for example from the ObjAsm) will be the EntryPoint of our code.

Practical Example

The image below displays the AIF header and the Binary Image of a typical HelloWorld program (source is ARM ASM), as shown in the Acorn / ROOL DDT debugger. The code at the beginning is the symbolic disassembly of the run-time system initialisation code.

[Figure: DDT-HelloWASM-AIFHeader]

  • At location 0x8000 (locations are on your left, first window column) we find the first NOP (mind that the NOP instruction gets disassembled as MOV r0,r0 on ARM). This tells us that the executable AIF above is not compressed.
  • The second NOP at 0x8000 + 0x04 tells us that this AIF is NOT self-relocating.
  • The BL at 0x8000 + 0x08 tells us this AIF has a ZeroInit section at 0x8040, which starts with the NOP. When the ZeroInit has completed, at 0x8070, we load the Link Register (lr) back into the Program Counter (pc). This is the same as a “return” instruction on ARM, so we’ll return to 0x800c.
  • At 0x800c execution will jump to our Binary Image EntryPoint, which has the label main (don’t confuse this label with the C main function entry point: in this case the main label corresponds to the ASM directive ENTRY).
  • At 0x8080 our HelloWorld code starts and does its job.
  • At location 0x8030 we can see that the Address mode is set to 0. This is because, when I assembled and linked the ASM source, I did not specify any external library, so no APCS (ARM Procedure Call Standard) was set for this Executable AIF.
  • At location 0x8010 we can see the SWI OS_Exit
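To make the reading of the first header words more concrete, here is a minimal C sketch that tells a NOP from a BL and computes where a BL points. The helper names are hypothetical. The encodings used are the standard ARM ones: MOV r0,r0 is 0xE1A00000, and an unconditional BL has 0xEB in its top byte with a signed 24-bit word offset relative to the instruction address plus 8. The example words in the usage note are computed by me from the addresses in the walkthrough, not read off the screenshot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AIF_NOP 0xE1A00000u /* MOV r0, r0: what the disassembler shows for NOP */

/* True if the header word is the NOP used to mark an unused header slot. */
bool aif_is_nop(uint32_t word)
{
    return word == AIF_NOP;
}

/* True if the word is an unconditional BL (cond = AL, bits 27..24 = 1011). */
bool aif_is_bl(uint32_t word)
{
    return (word & 0xFF000000u) == 0xEB000000u;
}

/* Address a BL jumps to, given the address the BL itself is located at. */
uint32_t aif_bl_target(uint32_t word, uint32_t instr_addr)
{
    assert(aif_is_bl(word));
    uint32_t offset = word & 0x00FFFFFFu;  /* signed 24-bit word offset */
    if (offset & 0x00800000u)
        offset |= 0xFF000000u;             /* sign-extend to 32 bits */
    /* The offset counts words and is relative to instr_addr + 8. */
    return instr_addr + 8u + (offset << 2);
}
```

With these helpers, a BL encoded as 0xEB00000C at 0x8008 resolves to 0x8040 (the ZeroInit code), and 0xEB00001B at 0x800c resolves to 0x8080 (the entry point), which matches the addresses in the list above.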

The DBGInit Instruction

The Debug Initialisation Instruction at 0x8040 is optional, and generally this field is left as NOP. However, according to the official RISC OS documentation, this field, if used, is expected to be a SWI instruction that alerts a debugger that a debuggable image is starting execution. This, however, doesn’t seem to be required by debuggers on RISC OS.

The ZeroInit code

This code is generally standard and is added by the Linker when we link our object file generated either by a compiler or an assembler.

It is basically a simplified version of self-move code, and it makes sure the AIF can be tailored easily to new environments.

Below is an example with comments extracted from the official ARM docs (for your reference):

        NOP                       ; or <Debug Init Instruction> 
        SUB    ip, lr, pc         ; base+12+[PSR]-(ZeroInit+12+[PSR])
                                  ; = base-ZeroInit
        ADD    ip, pc, ip         ; base-ZeroInit+ZeroInit+16 = base+16
        LDMIB  ip, {r0,r1,r2,r4}  ; various sizes
        SUB    ip, ip, #16        ; image base
        ADD    ip, ip, r0         ; + RO size
        ADD    ip, ip, r1         ; + RW size = base of 0-init area
        MOV    r0, #0
        MOV    r1, #0
        MOV    r2, #0
        MOV    r3, #0
        CMPS   r4, #0
    00  MOVLE  pc, lr             ; nothing left to do
        STMIA  ip!, {r0,r1,r2,r3} ; always zero a multiple of 16 bytes
        SUBS   r4, r4, #16
        B      %B00

Ok, below is a detailed analysis of the ZeroInit code above, as it would apply to the example AIF file in the DDT screenshot described previously. This should help the reader understand the ZeroInit code, and see that even if each linker or compiler release may produce slightly different ZeroInit code, its function stays exactly the same.

1st line is a NOP instruction, remember this is the DBGInit instruction and in this case it’s unused, so NOP.

2nd line: the SUB instruction is the first step in calculating the base address for the Zero Initialised data without assuming where the image was loaded. The math is relatively simple:

  • The value read from PC (Program Counter, R15 in AArch32) is subtracted from the address in LR (Link Register, R14 in AArch32) and the result is placed in IP (Intra Procedure call scratch Register, R12 in AArch32). The SUB itself sits at 0x8044 but, on ARM, reading PC yields the address of the current instruction plus 8, so PC reads as 0x804c. LR should contain 0x800c, which is the location of the BL instruction to the Binary Image entry-point address.
  • At the end of this, in the example code, IP should contain 0xFFFFFFC0 (note this is a negative number: -0x40!)

3rd line: adds the value stored in IP to the value read from PC (the ADD sits at 0x8048, so PC now reads as 0x8050, while IP still holds the value calculated above) and puts the result in IP (the result should be 0x8010, which also explains why on the 2nd line we wanted a negative value in IP). The value in IP is base + 16: the last location of the header before the size fields, which holds the exit instruction, in our case SWI OS_Exit.
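The arithmetic of the 2nd and 3rd lines can be checked with a few lines of C. This is only a sketch with hypothetical function names: it models the 32-bit registers as uint32_t, relies on the ARM rule that reading PC yields the address of the current instruction plus 8, and ignores the PSR bits that the original comments mention (they appear in both LR and PC on 26-bit systems and cancel out in the subtraction).

```c
#include <assert.h>
#include <stdint.h>

/* Value an ARM instruction located at instr_addr sees when it reads PC. */
uint32_t arm_pc_read(uint32_t instr_addr)
{
    return instr_addr + 8u;
}

/* Models "SUB ip, lr, pc" for an image whose header is loaded at base. */
uint32_t zeroinit_ip_after_sub(uint32_t base)
{
    uint32_t zeroinit = base + 0x40u;  /* DBGInit/NOP slot, start of ZeroInit */
    uint32_t lr = base + 0x0Cu;        /* return address set by the BL at base + 0x08 */
    return lr - arm_pc_read(zeroinit + 4u);  /* the SUB is the 2nd instruction */
}

/* Models the following "ADD ip, pc, ip": the result is base + 16. */
uint32_t zeroinit_ip_after_add(uint32_t base)
{
    uint32_t zeroinit = base + 0x40u;
    uint32_t ip = zeroinit_ip_after_sub(base);
    assert(ip == 0xFFFFFFC0u);         /* always -0x40, whatever the base is */
    return arm_pc_read(zeroinit + 8u) + ip;  /* the ADD is the 3rd instruction */
}
```

For base 0x8000 the first function gives 0xFFFFFFC0 and the second 0x8010. The same code gives base + 16 for any other base, which is the whole point of doing the calculation relative to PC.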

For beginners: Given that an AIF is relocatable, it is necessary to calculate base addresses because they may not be the ones we expect. In this example they are the standard virtual addresses because our AIF did not try to relocate.

On the 4th line we load multiple registers (R0, R1, R2 and R4) with the values contained in the memory locations starting with the one pointed to by IP + 1 WORD, so basically from location 0x8014 in our case. To do that we use the LDMIB instruction (simple ASM trick: LDMIB means LoaD Multiple Increment BEFORE), which increments the address before each load. IP itself is left unchanged, because the instruction has no write-back. This makes sense because, as we have seen before, the value in IP when the 4th line starts to execute is the address of the last instruction in the first part of the AIF header, so we need to move to the next address after that one to reach the size fields.

So, after the above line we have:

  • R0 should contain the value 0x00a0 (which is the size of the ReadOnly area, look at the AIF header details table above, row 0x14, for info)
  • R1 should contain 0x0000 (this is the Image ReadWrite area size, look at the AIF details table above, row 0x18, for more info)
  • R2 should contain 0x0298 (this is the Debug area size, look at the AIF details table above, row 0x1c, for info), more details on how to get this populated here
  • R4 should contain 0x0000 (this is the Image ZeroInit size, have a look at the AIF details table above, row 0x20, for more info). In our case, do not rely on this value, given that the ZeroInit code we are describing is different from the one used by the executable in the DDT screenshot

On the 5th line we decrement IP by 16 (in some cases you may see this value represented as #&10, which is the hexadecimal representation of the decimal 16). Now IP should contain the base address of the AIF Image (in our case 0x8000, which in RISC OS is the standard virtual base address for all WIMP application executables).

On the 6th line we add the value contained in R0 to the value contained in IP and we store the result in IP. Now IP should contain the AIF image base virtual address + the size of the ReadOnly area, which in our case should be 0x80a0.

On the 7th line we add the value contained in IP with the value contained in R1 and we put the result in IP. Now IP should contain the value of virtual base address of the AIF image + the size of the ReadOnly area + the size of the ReadWrite area, which in our case is still 0x80a0 because our ReadWrite area size was 0. Now this value is the base virtual address of our ZeroInit area.

From line 8 to line 11 we simply set Registers R0,R1,R2 and R3 values to zero (if you are wondering why also R3, that’s because we want to do a zeroInit that is a multiple of 16 when we’ll do the STMIA at line 14).

At line 12 we check if the value in R4 is zero. If it is, CMPS will set flag Z in the CPSR (Current Program Status Register) to 1, otherwise the flag will be set to 0. Please note: the S suffix at the end of CMP is irrelevant in modern ARM ASM, given that CMP always updates the CPSR flags, so the example code is old.

At line 13 we execute a MOVLE (LE stands for Less than or Equal) of the LR register to PC if R4 is less than or equal to 0. This basically creates a conditional return instruction that should be read as: if we are done with the zeroInit, let’s return to the caller.

At line 14 we set to zero the memory starting at the address pointed to by IP, by storing the content of registers R0, R1, R2 and R3 (16 bytes at a time), and we increment the value of IP so that at the next round we’ll zero the next 4 locations after the ones we initialised now.

At line 15 we subtract 16 from the value contained in R4. Given that we used SUBS (note the S), the result of the operation will also update the flags in the CPSR (in this case the S is needed by the MOVLE at line 13, which we jump back to from line 16).

At line 16 we jump back to line 13 and we repeat the initialisation process until all the zeroInit area is set to 0 🙂
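For readers more at home in C, the whole routine can be modelled roughly as follows. This is a sketch under assumptions, not what the linker emits: the image is a plain byte buffer whose first byte is the start of the AIF header, the function name and the capacity parameter are hypothetical, and the loop mirrors the MOVLE / STMIA / SUBS sequence, including the fact that it always zeroes whole 16-byte chunks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Zeroes the zero-init area of an Executable AIF image held in `image`.
   ro_size, rw_size and zi_size are the header words at 0x14, 0x18 and 0x20.
   Returns the number of bytes actually zeroed. */
size_t aif_zero_init(uint8_t *image, size_t capacity,
                     uint32_t ro_size, uint32_t rw_size, int32_t zi_size)
{
    /* Lines 5-7: image base + RO size + RW size = start of the zero-init area. */
    size_t offset = (size_t)ro_size + (size_t)rw_size;
    size_t zeroed = 0;

    while (zi_size > 0) {                /* MOVLE pc, lr: stop when R4 <= 0 */
        assert(offset + 16u <= capacity);
        memset(image + offset, 0, 16u);  /* STMIA ip!, {r0,r1,r2,r3} */
        offset += 16u;
        zeroed += 16u;
        zi_size -= 16;                   /* SUBS r4, r4, #16 */
    }
    return zeroed;
}
```

Note that a zero-init size of 20 bytes results in 32 bytes being cleared, which is why the original comment says "always zero a multiple of 16 bytes".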

AIF Header for C developers

The following struct represents an AIF32 header in C. There is a fully implemented one in the ROOL DDE, in C:DDTLib.h.AIFHeader, if you want to include it in your own code.

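#include <stdint.h>

/* AIF header: 32 words (128 bytes). Offsets refer to the header table above. */
typedef struct {
    uint32_t BL_decompress_code;  /* 0x00: BL DecompressCode or NOP */
    uint32_t BL_selfreloc_code;   /* 0x04: BL SelfRelocCode or NOP */
    uint32_t BL_zeroinit_code;    /* 0x08: BL ZeroInit or NOP */
    uint32_t BL_imageentrypoint;  /* 0x0C: BL ImageEntryPoint or EntryPoint offset */
    uint32_t swi_OSExit;          /* 0x10: program exit instruction */
    uint32_t size_ro;             /* 0x14: Image ReadOnly size */
    uint32_t size_rw;             /* 0x18: Image ReadWrite size */
    uint32_t size_debug;          /* 0x1C: Image Debug size */
    uint32_t size_zeroinit;       /* 0x20: Image ZeroInit area size */
    uint32_t debug_type;          /* 0x24: Image Debug type */
    uint32_t image_base;          /* 0x28: address where the code was linked */
    uint32_t workspace;           /* 0x2C: obsolete */
    uint32_t reserved[4];         /* 0x30-0x3C: address mode, data base, two reserved words */
    uint32_t zeroinitcode[16];    /* 0x40-0x7C: DBGInit (or NOP) and the 15 ZeroInit words */
} AIF32HeaderBlock;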
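As a usage example, the sketch below fills such a struct from the first 128 bytes of an AIF file. It repeats the struct so that the block is self-contained. The function names are hypothetical (this is not the DDTLib API), and the code assumes the file stores its words in little-endian order, as RISC OS systems do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t BL_decompress_code;
    uint32_t BL_selfreloc_code;
    uint32_t BL_zeroinit_code;
    uint32_t BL_imageentrypoint;
    uint32_t swi_OSExit;
    uint32_t size_ro;
    uint32_t size_rw;
    uint32_t size_debug;
    uint32_t size_zeroinit;
    uint32_t debug_type;
    uint32_t image_base;
    uint32_t workspace;
    uint32_t reserved[4];
    uint32_t zeroinitcode[16];
} AIF32HeaderBlock;

/* The header must be exactly 32 words long. */
_Static_assert(sizeof(AIF32HeaderBlock) == 128, "AIF header must be 128 bytes");

/* Reads one little-endian 32-bit word, whatever the host byte order is. */
uint32_t aif_read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decodes the first 128 bytes of an AIF file into *out. */
void aif_header_from_bytes(const uint8_t *bytes, AIF32HeaderBlock *out)
{
    assert(bytes != NULL && out != NULL);
    uint32_t words[32];
    for (size_t i = 0; i < 32; i++)
        words[i] = aif_read_le32(bytes + 4u * i);
    memcpy(out, words, sizeof *out);  /* the struct is 32 consecutive words */
}
```

In a real tool the 128 bytes would come from reading the start of the !RunImage file. Going through aif_read_le32 rather than casting the buffer avoids alignment and byte-order surprises when the tool runs on a non-ARM host.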

Ok, that’s it for now. Thanks for reading, and I hope you’ve found some useful information here.

Thank you! 🙂

More references:
