Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 6
Exercise 6 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery6
:
|
|
Due to the presence of 16 bit instructions, instructions having the .W
suffix and function prologue and epilogue with PUSH
and POP
respectively, we are dealing with code in Thumb state.
The function apparently takes two arguments in R0
and R1
, where R0
is a pointer to an unknown structure (line 4, line 9) and R1 is a value which is compared to elements of the unknown structure (line 10). For the return values, it seems both R0
and R1
contain in this case return values, as both are set prior to exiting the function. For this purpose, we consult the official ARM documentation for the Procedure Call Standard http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042f/IHI0042F_aapcs.pdf.
The relevant section for us is the following:
A double-word sized Fundamental Data Type (e.g., long long, double and 64-bit containerized vectors) is returned in r0 and r1.
Thus, the function seems to return a 64 bit value. Our preliminary function prototype is as follows:
|
|
Lines 1-7 set register R2
to zero and examine the value stored at offset 0x0
of the unknown structure. When it is less or equal to zero, the function returns 0. Otherwise, the second block beginning with line 9 is executed.
There, the value stored af R0+0x4
of the unknown structure is loaded and the value of R0
increased by 0x4
afterwards (pre-indexed address mode). This value is compared to the second argument arg2.
In case of inequality, the register value of R2
is incremented by 1. Furthermore, R2
is compared to the value stored at offset 0x0 of the unknown structure. If it greater than or equal to this value, the function returns 0. If it is less than the value at offset 0x0, the control flow begins again at line 8.
In case of equality, the control flow continues at line 20.
Consequently, the program executes a loop and uses register R2
as a counter variable, while the value of the unknown structure at offset 0x0
serves as a limit parameter. At offset 0x4
of the unknown structure, there seems to be a sequence of values that each require 4 bytes of storage. We conclude that it probably is an array of type int. The preliminary structure is as follows:
|
|
With this knowledge in mind, it is highly likely that the element field00_i stores the number of elements in the array. Thus, we give the variables more descriptive names:
|
|
Moreover, the second argument arg2 is compared to the array values and if it cannot be matched to any of the array values, the function returns 0. Thus, the function seems to search after the second argument in the array.
Beginning with line 20, we know that the counter variable R2
contains the array index where the second argument can be found. In line 21, the counter variable R2
is reduced by 0x20 (decimal 32) and stored in R3. Afterwards, a left shift of the value 0x1 by the value stored in R2
is performed and stored in R1
. In mathematical terms, we have: R1 = 0x1 << (R2-0x20)
.
Afterwards, a left shift of the value 0x1
by the counter variable R2
is performed and stored in R0
.
Lastly, the function epilogue is executed and the values from R3, R4 and R11
are restored.
At first glance, the left shifting operations might confuse the uninitiated eye. So what is actually happening here? It might help to recall that the function returns a 64-bit value and the lower 32 bits are stored in R0
and the upper 32 bits are stored in R1
.
Ratherthan outlining the theoretical possibilities, it may make more sense to take some example values for the counter variable. Let’s consider three different cases:
- 0 <= R2 <= 31
- 32 <= R2 <= 63
- 64 <= R2
In the first case, the first shift operation will simply store 0 in R1
, as the substraction of 32 from a value in [0;31]
will result in a shift operation that moves 0x1
beyond what can be represented with a 32 bit value. R0
, however, will contain a single “1” bit exactly at the position defined by R2.
In the second case, the first operation will store a single “1” bit exactly at the position defined by (R2-32
), whereas R0
only contains zeroes.
In the third case, both R0
and R1
will only contain zeroes as both shift operations try to shift the least significant bit 0x1 to beyond what can be represented with a 32 bit value.
In a nutshell, the result value R0:R1
is a bitmap or bitfield, which contains a single “1” bit at the position indicates by the counter variable R2
, which in turn indicates where in the array of the first argument the second argument can be found.
There is, however, an important caveat: When the second argument is located after the 64th element of the input array, the function will return 0.
Lastly, we provide a decompiled function mystery6
:
|
|