# HITB PRO CTF - cells

The cells service allows users to save their secret data in “DNA of cells”.

The service’s frontpage provides a login and registration form, as well as a search function:

Via the search function, we can also search for users matching a pattern and download their cells.

The webinterface is implemented as a python flask service, however, the store and check endpoints offload their main functionality to an ELF-binary simply called cells.elf.

## The Flagstore

Unsurprisingly, flags are stored in other users’ cells. However, without a password, these cell-files seemingly only contain binary garbage. Thus, in order to get some flags, we either need to obtain a user’s password, or find some other way to recover the flag.

## A custom VM

After spending some hours with Ghidra, we find that cells.elf actually implements a custom virtual machine. Storing a secret is implemented as generating a program which will output the flag and a checksum when given the corresponding password as input. Checking a secret is then simply executing the generated program with the provided password and verifying the checksum. This also explains the content of the downloadable cell-files: The presumed binary garbage is actually VM bytecode.

For example, in one cell the gameserver stored the flag OADJUIS7KNBFLTNY4P5MZFEE0EDC381= using the password U0VSegNW, which resulted in the following cell:

And indeed, executing this cell returns the flag plus a hex-encoded checksum separated by a comma:

\$ ./cells.elf cell.pk.cell U0VSegNW


## VM internals

The VM uses 3 bytes to encode an instruction. Bits 21-18 denote the opcode, and bits 18-0 encode the arguments: three six-bit arguments for comparison operators, two eight-bit arguments for arithmetic instructions, or a single 16- or 18-bit argument (I/O and jump instructions). In most cases, arguments are interpreted relative to the program counter. In addition, the two highest bits have special functions: Bit 23 (restart) signals that after this instruction, the program counter shall be reset to zero, while bit 22 (remove) indicates that the instruction should be removed from the program once it has been executed.

Execution operates on an array structure, where each slot stores a 4-byte value. The array is initialized by loading 3-byte chunks from the input cells file. During execution, the slot pointed to by the program counter will be decoded as an instruction, executed, and the result written back into the current slot, if a new value was generated. Execution terminates, once the current slot cannot be decoded as a valid instruction, on a JMP instruction to address 0, or when a given number of steps is exceeded (5000 by default).

In total, the VM has 17 opcodes:

opcode instruction
0   illegal instruction
1 add x y arr[pc] = arr[pc + x] + arr[pc + y]
2 sub x y arr[pc] = arr[pc + x] - arr[pc + y]
3 mul x y arr[pc] = arr[pc + x] * arr[pc + y]
4 div x y arr[pc] = arr[pc + x] / arr[pc + y]
5 rem x y arr[pc] = arr[pc + x] % arr[pc + y]
6 xor x y arr[pc] = arr[pc + x] ^ arr[pc + y]
7 or x y arr[pc] = arr[pc + x] | arr[pc + y]
8 and x y arr[pc] = arr[pc + x] & arr[pc + y]
9 copy imm16 arr[pc] = signext(imm16)
a input imm16 arr[pc + imm16] = getchar() (input is taken from cmdline argument, not stdin)
b output imm16 putchar(arr[pc + signext(imm16)])
c ifzero x y z arr[pc] = arr[x] (presumably should have been arr[pc] = (arr[pc + x] == 0) ? arr[pc + y] : arr[pc + z])
d ifgt x y z arr[pc] = arr[x] (presumably should have been arr[pc] = (arr[pc + x] < 1) ? arr[pc + z] : arr[pc + y])
e ifls x y z arr[pc] = arr[x] (presumably should have been arr[pc] = (arr[pc + x] < 0) ? arr[pc + y] : arr[pc + z])
f skip imm18 pc = pc + imm18
10 skipref imm18 pc = arr[pc + imm18] (cannot be decoded, instruction decoder truncates opcodes to 4 bits)

Given that the generated programs are quite short, the number of opcodes is small, and the password length is limited to eight characters, we did what anyone with a background in writing symbolic execution engines for weird VMs would do: re-implement the VM in python to make a symbolic execution engine.

The idea here was that if we can execute a cell program symbolically, we might have enough information to feed to z3 to recover the flag and password.

To keep things simple (and since it was already way past midnight at that point), we only handled the easy cases and ignored symbolic program counters etc. The implementation is mostly straightforward, though in some cases python integers and z3 BitVecs require individual handling (e.g. integer division):

Given an array of eight symbolic input bytes, this function provides us with an array of output-expressions:

We further know that all input and output bytes are in the printable range:

In addition we can make use of the checksum: The ninth-last output must be a comma, and the remaining eight characters must be hexdigits:

Lastly, we can also recompute the checksum. Fortunately, for this we use the given python code as is, and only need to provide a to_hex function that can handle symbolic values:

From here we should have given z3 enough to work with to recover the password and the flag:

which, indeed, works:

In [1]: with open('cell.pk.cell','rb') as f:
...:

In [2]: solve(cell)

Patching the cells service proved quite difficult: Restricting the search function to patterns longer than two characters is impossible, since the gameserver checks for such short patterns. The only way out is therefore to make the generated cell programs more complicated. For example, to defend against our exploit, it would have been enough to obfuscate the code by XORing it with one of the input bytes. As the resulting expressions would have been symbolic, we could have no longer decoded the instructions correctly. However, during the CTF a overwhelming desire for sleep meant that our cells service remained unpatched.