Tuesday 18 December 2012

Groovy DSL: Executing scripts in a sandbox

In my previous post I talked about how a script can be converted to a closure, and how this decouples the DSL implementation from the GroovyShell. The DSL can be used directly from code, but also from an external source, without any changes to the DSL implementation. In this post I will talk about running a script in a sandbox.

Untrusted source

An external source can also mean an untrusted source. For example, when storing scripts in a database on a server, and allowing clients to write scripts that are to be executed on the server, the scripts are able to bring the whole server down with a simple System.exit(0). Or worse.

This is possible because the DSL allows any Groovy code to be run, not just the API provided by us. What we actually want in our case is for the script to use only our API and nothing else. In other cases, depending on the type of script, we may also want to allow a few standard JDK or GDK APIs, or at least only the 'safe' ones.

For the latter case, Java already provides a SecurityManager, and Groovy has full support for it. For the first case, Groovy AST Transformations may help us out. Let's try the SecurityManager approach, because it can be more generally applied.

Java Security

To run a Java program on a JVM in a sandbox, the Java SecurityManager must be enabled. This can be achieved by passing -Djava.security.manager when starting the program, or by setting it from within the program with System.setSecurityManager(new SecurityManager()). This will also use the default Policy implementation, PolicyFile, that reads all permissions to be granted to code bases from policy configuration files.
We can immediately see two drawbacks here. The first is that the security manager must be set globally, and the second is that all permissions are read from a file.

Security manager

Unless we spawn a new Java process to run our script, our application has to be running under the security manager too. Spawning a new process is not really an option, because this makes it much harder to integrate the DSL into our application. Setting a global security manager is no problem if all of our code is granted all permissions, and the script code is granted only the permissions we permit. This can be done using policy files if the default implementation is used. A custom security manager is also an option, but it's not necessary to replace the whole security manager. A custom policy implementation is enough, because it's the reference policy implementation that requires the policy files.

Policy files

If you are lucky, you already run your application with the security manager enabled, and have the policy file(s) sorted out. Otherwise, you're going to have to create a policy file right now (or skip to the next section).

The default policy file does not correspond to the situation where no security manager is set. When the security manager is set, our code is running in a restricted environment, i.e. a sandbox.

To restore the situation back to what it was, we could create a policy file in which all permissions are granted to all code:

grant { permission java.security.AllPermission; };

But then we'd have another problem, because there is no way to add any exceptions to this rule: Permission assignment is additive. It works this way for simplicity and performance reasons.

Still, if we were not running a security manager in the first place, we already had all permissions, and now we just want to grant minimal permissions for our scripts. 

As mentioned earlier, for our script we want to add a grant entry with the script code base that specifies only permissions we permit. The "/groovy/shell" code base is the default code base used by GroovyShell if the script is supplied as a String or a Reader ("/groovy/script" if GroovyClassLoader is used directly), so unless we pass it a custom code base, the entry should be:

grant codeBase "file:/groovy/shell" { };

No permissions are listed here, so this will fully restrict our script.

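Unfortunately, this entry has no effect when combined with the earlier entry that grants all permissions to all code bases. Permissions are additive, and the script code base is also matched by that earlier entry, so the script would still receive AllPermission.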
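The additive behaviour can be reproduced with the plain java.security.Permissions collection, without installing a security manager. The following is a minimal sketch, not the PolicyFile implementation: the class name AdditiveGrantsSketch and the helper methods union and mayReadClassVersion are made up for illustration, and the union is a simplified stand-in for how the permissions of all matching grant entries are combined.

```java
import java.security.AllPermission;
import java.security.Permission;
import java.security.PermissionCollection;
import java.security.Permissions;
import java.util.Collections;
import java.util.PropertyPermission;

final class AdditiveGrantsSketch {

    // Combines the permissions of every grant entry that matches a code base.
    // Simplified model (assumption): a real policy also handles signers, principals, etc.
    static Permissions union(PermissionCollection... matchingEntries) {
        Permissions effective = new Permissions();
        for (PermissionCollection entry : matchingEntries) {
            for (Permission p : Collections.list(entry.elements())) {
                effective.add(p);
            }
        }
        return effective;
    }

    // True if the given permissions allow reading the java.class.version property.
    static boolean mayReadClassVersion(PermissionCollection perms) {
        return perms.implies(new PropertyPermission("java.class.version", "read"));
    }

    public static void main(String[] args) {
        // Models: grant { permission java.security.AllPermission; };
        Permissions grantAll = new Permissions();
        grantAll.add(new AllPermission());

        // Models: grant codeBase "file:/groovy/shell" { };
        Permissions grantScript = new Permissions();

        // The empty script entry on its own denies the read.
        System.out.println(mayReadClassVersion(grantScript));                  // false
        // Combined with the grant-all entry, the read is allowed again.
        System.out.println(mayReadClassVersion(union(grantAll, grantScript))); // true
    }
}
```

There is no operation that removes a permission from the combined set, which is why an empty entry cannot act as an exception to the grant-all entry.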

To get around this, we have to specify all code bases except the script code base when granting all permissions. This can be a cumbersome task, because the code bases can be different under different circumstances (development, testing, production, etc.). And these places are managed elsewhere, which means they have to be kept in sync somehow.

It's obvious that Java Security was not designed with scripting in mind. But they did make it extensible...

Custom Policy

There is a way around policy files, and that is by not using the reference implementation. This means we have to write a custom policy provider that fits our needs. The standard way of specifying the policy provider is by setting the policy.provider value in the java.security file to the fully qualified class name. But since we don't want to change any external files, we want to set the policy at run-time, which is possible through Policy.setPolicy.

Of course, this would replace the default policy completely for the whole application, so I assume here you are not yet using a security manager. If you are, just add the grant entry for the script to your policy file and you're done.

So now that we know how to replace the default policy with our own, we'll have to implement it. We are required to subclass the abstract class java.security.Policy. We only need to override one method:

import java.net.MalformedURLException;
import java.net.URL;
import java.security.AllPermission;
import java.security.CodeSource;
import java.security.PermissionCollection;
import java.security.Permissions;
import java.security.Policy;
import java.util.HashSet;
import java.util.Set;

public final class ScriptPolicy extends Policy {
    // Code bases that must run without any permissions
    private final Set<URL> locations;

    public ScriptPolicy() {
        try {
            locations = new HashSet<URL>();
            locations.add(new URL("file", "", "/groovy/shell"));
            locations.add(new URL("file", "", "/groovy/script"));
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    @Override
    public PermissionCollection getPermissions(CodeSource codesource) {
        PermissionCollection perms = new Permissions();
        // Everything except the restricted script locations gets all permissions
        if (!locations.contains(codesource.getLocation())) {
            perms.add(new AllPermission());
        }
        return perms;
    }
}

Note that I made it a Java class, not a Groovy class (but it could be).

The constructor adds the URLs of the default code bases used by GroovyShell and GroovyClassLoader to a set.

In getPermissions, a new Permissions instance is created, to which an AllPermission instance is added if the supplied codesource URL is not equal to any of the restricted script locations.
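The decision made in getPermissions can be checked in isolation, without installing the policy or a security manager. The sketch below repeats the same lookup in a plain static method, so it is an illustration of the check rather than the ScriptPolicy class itself. The class name ScriptPolicyDecisionSketch, its helper methods and the path /opt/app/app.jar are hypothetical and only stand in for "some other code base".

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.security.AllPermission;
import java.security.CodeSource;
import java.security.PermissionCollection;
import java.security.Permissions;
import java.security.cert.Certificate;
import java.util.HashSet;
import java.util.Set;

final class ScriptPolicyDecisionSketch {

    // Builds a file URL with an empty host, the same way the policy above does.
    static URL fileUrl(String path) {
        try {
            return new URL("file", "", path);
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    // Same lookup as in getPermissions: restricted locations get an empty
    // collection, every other code source gets AllPermission.
    static PermissionCollection permissionsFor(Set<URL> restricted, CodeSource source) {
        PermissionCollection perms = new Permissions();
        if (!restricted.contains(source.getLocation())) {
            perms.add(new AllPermission());
        }
        return perms;
    }

    // True if code loaded from the given path would receive AllPermission.
    static boolean isUnrestricted(String path) {
        Set<URL> restricted = new HashSet<URL>();
        restricted.add(fileUrl("/groovy/shell"));
        restricted.add(fileUrl("/groovy/script"));
        CodeSource source = new CodeSource(fileUrl(path), (Certificate[]) null);
        return permissionsFor(restricted, source).implies(new AllPermission());
    }

    public static void main(String[] args) {
        System.out.println(isUnrestricted("/groovy/shell"));    // false
        System.out.println(isUnrestricted("/groovy/script"));   // false
        System.out.println(isUnrestricted("/opt/app/app.jar")); // true (hypothetical path)
    }
}
```

The match is an exact URL comparison, so a script is only restricted if it is compiled with precisely one of the registered code bases.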

Testing

Let's create a unit test (JUnit 4) that runs the Calculator class from my previous post inside our sandbox. We start with a simple test of the existing functionality, which should still work as before:

import java.security.Policy
import org.junit.BeforeClass
import org.junit.Test

class CalculatorSandboxTests {
    @BeforeClass
    static void setUpSecurity() {
        // The policy must be installed before the security manager
        Policy.setPolicy(new ScriptPolicy())
        if (System.getSecurityManager() == null) {
            System.setSecurityManager(new SecurityManager())
        }
    }

    def runInContext(Object context, String script) {
        Closure cl = (Closure) new GroovyShell().evaluate("{->$script}")
        cl.delegate = context
        cl.resolveStrategy = Closure.DELEGATE_FIRST
        cl()
    }

    private calculate(String script) {
        def calculator = new Calculator()
        runInContext(calculator, script)
    }

    @Test
    void testSimple() {
        assert 5.0 == calculate("hypotenuse(3.0, 4.0)")
    }
}

In setUpSecurity we set our custom policy. This must happen before we set the security manager, otherwise the security manager starts out fully restrictive and denies the call to Policy.setPolicy.

The simple test itself is testSimple.

If we run this test it passes. So far so good. Now let's add a test that does something that needs permission:

@Test
void testPropertyPermissionInScript() {
    assert 49.0 < calculate("System.getProperty('java.class.version') as double")
}

This test fails:

java.security.AccessControlException: access denied ("java.util.PropertyPermission" "java.class.version" "read")

So our policy seems to work! Let's change the test to make it pass:

@Test(expected = AccessControlException)
void testPropertyPermissionDeniedInScript() {
    // access denied ("java.util.PropertyPermission" "java.class.version" "read")
    assert 49.0 < calculate("System.getProperty('java.class.version') as double")
}

Now what if we really need to allow the script to read this property? We could change our policy to add this permission for scripts. But a better idea is to extend the DSL to provide it:

class Calculator {
    double hypotenuse(double width, double height) {
        Math.sqrt(width * width + height * height)
    }

    double javaClassVersion() {
        System.getProperty("java.class.version") as double
    }
}

Now what will happen if we call the new method from the script? Let's test it:

@Test
void testPropertyPermissionInCalculator() {
    assert 49.0 < calculate("javaClassVersion()")
}

If we run this test it will fail. This is because every class that has a method on the call stack must have the required permission, and since javaClassVersion() is called from the script, which has no permissions at all, access is denied.

To make this work we must wrap it in a call to AccessController.doPrivileged:

double javaClassVersion() {
    AccessController.doPrivileged({
        System.getProperty("java.class.version") as double
    } as PrivilegedAction)
}

This makes the access controller stop at the Calculator method when it walks the call stack, so the script that called it is no longer checked.
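The effect of doPrivileged on the permission check can be modelled with a few lines of plain Java. This is a deliberately simplified sketch of the stack walk, not the AccessController implementation: the class StackWalkSketch, the Frame type and the method names are invented for illustration, and each frame simply carries the permissions granted to its class.

```java
import java.security.AllPermission;
import java.security.Permission;
import java.security.PermissionCollection;
import java.security.Permissions;
import java.util.Arrays;
import java.util.List;
import java.util.PropertyPermission;

final class StackWalkSketch {

    // One frame of the modelled call stack (hypothetical type).
    static final class Frame {
        final String name;
        final PermissionCollection granted;
        final boolean privileged; // true if this frame called doPrivileged

        Frame(String name, PermissionCollection granted, boolean privileged) {
            this.name = name;
            this.granted = granted;
            this.privileged = privileged;
        }
    }

    // Walks from the most recent frame to the oldest one. Every visited frame
    // must imply the permission; the walk ends after a privileged frame.
    static boolean isAllowed(List<Frame> newestFirst, Permission wanted) {
        for (Frame frame : newestFirst) {
            if (!frame.granted.implies(wanted)) {
                return false;
            }
            if (frame.privileged) {
                return true;
            }
        }
        return true;
    }

    // Models the script calling javaClassVersion() with or without doPrivileged.
    static boolean scriptMayCallJavaClassVersion(boolean calculatorUsesDoPrivileged) {
        Permissions all = new Permissions();
        all.add(new AllPermission());
        Permissions none = new Permissions();

        List<Frame> stack = Arrays.asList(
                new Frame("Calculator.javaClassVersion", all, calculatorUsesDoPrivileged),
                new Frame("script", none, false),
                new Frame("application", all, false));
        return isAllowed(stack, new PropertyPermission("java.class.version", "read"));
    }

    public static void main(String[] args) {
        System.out.println(scriptMayCallJavaClassVersion(false)); // false
        System.out.println(scriptMayCallJavaClassVersion(true));  // true
    }
}
```

In this model the privileged frame itself is still checked, so wrapping a call in doPrivileged only helps code that has the permission in the first place. A script cannot use it to escalate its own rights.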

The test now passes, and we have a working sandbox in which scripts are fully restricted while the DSL implementation and the rest of the application remain unrestricted.

Custom code base

When we want to use this sandbox inside larger applications, we should not affect any other GroovyShell or GroovyClassLoader usages, but only restrict our own scripts. This can be done by specifying our own code base to GroovyShell.evaluate, for example "/groovy/myscript":

def runInContext(Object context, String script) {
    Closure cl = (Closure) new GroovyShell().evaluate(
            "{->$script}", "Script.groovy", "/groovy/myscript")
    cl.delegate = context
    cl.resolveStrategy = Closure.DELEGATE_FIRST
    cl()
}

We then need to change our ScriptPolicy class to allow custom code bases:

public final class ScriptPolicy extends Policy {
    private final Set<URL> locations = new HashSet<URL>();

    public ScriptPolicy() {
        try {
            addRestrictedCodeBase("/groovy/shell");
            addRestrictedCodeBase("/groovy/script");
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }
    
    public ScriptPolicy(String... codeBases) throws MalformedURLException {
        for (String codeBase : codeBases) {
            addRestrictedCodeBase(codeBase);
        }
    }

    public void addRestrictedCodeBase(String codeBase)
            throws MalformedURLException {
        addRestrictedLocation(new URL("file", "", codeBase));
    }

    public void addRestrictedLocation(URL location) {
        locations.add(location);
    }

    @Override
    public PermissionCollection getPermissions(CodeSource codesource) {
        PermissionCollection perms = new Permissions();
        if (!locations.contains(codesource.getLocation())) {
            perms.add(new AllPermission());
        }
        return perms;
    }
}

And pass the custom code base to our policy:

Policy.setPolicy(new ScriptPolicy("/groovy/myscript"))
if (System.getSecurityManager() == null) {
    System.setSecurityManager(new SecurityManager())
}

Friday 7 December 2012

Groovy DSL: Executing scripts within a context

When we design a domain specific language, it will likely be used in scripts that are stored in files or a database. The scripts also need to be executed inside some context that defines your DSL. There are several ways to define this context. Let's illustrate this using a very simple calculator DSL that provides a function to calculate the hypotenuse from the width and the height:

class Calculator {
    double hypotenuse(double width, double height) {
        Math.sqrt(width * width + height * height)
    }
}

We want our Calculator to be decoupled from the way it is used, so nothing in this class indicates that it will be used from a script.

We can execute scripts by using the GroovyShell class, which has an evaluate method that parses a script and runs it.

Let's look at how to use Calculator implicitly from a script.

Binding

The first way is by passing variables in a groovy.lang.Binding, which the shell implicitly exposes to the script:

def calculator = new Calculator()
def binding = new Binding(hypotenuse: calculator.&hypotenuse)
def shell = new GroovyShell(binding)
assert 5.0 == shell.evaluate("hypotenuse(3.0, 4.0)")

The drawback is that we have to set each method explicitly as a closure inside the binding. We can do this automatically for all methods of Calculator:

import org.codehaus.groovy.runtime.InvokerHelper

def calculator = new Calculator()
def binding = new Binding()
calculator.metaClass.methods.each { method ->
    def name = method.name
    binding.setVariable(name, InvokerHelper.getMethodPointer(calculator, name))
}
def shell = new GroovyShell(binding)
assert 5.0 == shell.evaluate("hypotenuse(3.0, 4.0)")

But somehow it feels like there should be a better way.

Script subclass

The second way is by subclassing groovy.lang.Script, which implements our DSL by defining the DSL methods. As we already have them implemented inside the Calculator class, we can use the groovy.lang.Delegate annotation on a Calculator instance:

abstract class CalculatorScript extends Script {
    @Delegate
    final Calculator calculator = new Calculator()
}

This makes all public instance methods (not properties!) of Calculator available within CalculatorScript, and thus within the DSL, because the DSL script will be part of CalculatorScript.

The code becomes:

import org.codehaus.groovy.control.CompilerConfiguration

def config = new CompilerConfiguration()
config.scriptBaseClass = CalculatorScript.name
def shell = new GroovyShell(config)
assert 5.0 == shell.evaluate("hypotenuse(3.0, 4.0)")

The drawback is that this script class cannot be instantiated by ourselves, so any context for our Calculator cannot be passed directly to it. Our Calculator is stand-alone, but a real-world DSL probably isn't. A possibility is to parse the script into the script instance, pass our context to it, and then run the script:

def config = new CompilerConfiguration()
config.scriptBaseClass = CalculatorScript.name
def shell = new GroovyShell(config)
def script = shell.parse("hypotenuse(3.0, 4.0)")
// script.calculator.context = ...
assert 5.0 == script.run()

Still, this does not feel right, because the Script class is really an implementation detail that we don't want to be concerned about. And it's likely the case that our DSL contains nested contexts, which will be implemented differently, namely through a delegate set on a Closure.

Closure

What we want is to implement the top level DSL context in the same way as nested contexts, through a closure delegate, like this:

def calculate(Closure script) {
    Calculator calculator = new Calculator()
    calculator.with(script)
}

The 'with' method is available on every object. It sets that object as the delegate of the closure and then executes the closure:

assert 5.0 == calculate {
    hypotenuse(3.0, 4.0)
}

How can we achieve this? We need a closure, but our script is a String. Somehow we have to convert the string to a closure and pass it to calculate().

What we can do is force the DSL user to wrap the script inside a call to a method that accepts a closure. The script itself will look like this:

calculate {
    hypotenuse(3.0, 4.0)
}

We need to define the calculate method in the script class or set it as a closure in the binding.

def calculator = new Calculator()
def binding = new Binding()
binding.setVariable('calculate') { script ->
    calculator.with(script)
}
def shell = new GroovyShell(binding)
assert 5.0 == shell.evaluate("calculate { hypotenuse(3.0, 4.0) }")

But now the user has to wrap every script inside a call to calculate. We can do this for the user though, as we will see next.

Solution

We can wrap the script inside a 'calculate' call ourselves, but why not get rid of the calculate method too? We can achieve our goal of converting the script into a closure by wrapping it as a closure like this:

Closure convertToClosure(String script) {
    (Closure) new GroovyShell().evaluate("return {$script}")
}

The result of evaluate is a Closure. The return statement is used to disambiguate between a code block and a closure. Now we can call the script by calling the closure:

def calculator = new Calculator()
def script = convertToClosure("hypotenuse(3.0, 4.0)")
assert 5.0 == calculator.with(script)

The only requirement is that the closure should accept a parameter, as the delegate is also passed as a parameter by the 'with' method. If this is undesirable, we can just run the closure ourselves:

def runInContext(Object context, String script) {
    Closure cl = (Closure) new GroovyShell().evaluate("{->$script}")
    cl.delegate = context
    cl.resolveStrategy = Closure.DELEGATE_FIRST
    cl()
}

The closure doesn't need to be cloned as we have just instantiated it ourselves. Because we don't pass any parameters to the closure, our wrapper can be defined to have no parameters, so we don't expose the 'it' variable to our script.

The calculate method becomes:

def calculate(String script) {
    def calculator = new Calculator()
    runInContext(calculator, script)
}

Now we can execute our script as a string in a Calculator context, by calling calculate:

assert 5.0 == calculate("hypotenuse(3.0, 4.0)")

Another advantage over the Binding and Script base class solutions is that properties inside Calculator are also available inside the script. And Calculator can implement dynamic properties and methods with propertyMissing/methodMissing or getProperty/setProperty/invokeMethod, just like normal Groovy builders.

Tuesday 20 November 2012

Grails: Unit testing of domain classes

To unit test a domain class with Grails 2.x, we can use the TestFor annotation. Within a test method, a new instance of our domain class will be available as 'domain'. We can set properties on the domain instance and then call validate(). The errors property can be used to check the expected validation errors (if any):

class Book {
    String title

    static constraints = { title minSize: 10 }
}

@TestFor(Book)
class BookTests {
    void testShortTitle() {
        domain.title = 'too short'
        assert !domain.validate()
        assert 'minSize.notmet' == domain.errors.getFieldError('title').code
    }
}

It would be nice if the last assert allowed the shorthand syntax to look up the title field error code (see also GRAILS-8415):

assert 'minSize' == domain.errors.title

However, that will not work:

--Output from testShortTitle--
| Failure:  testShortTitle(bookstore.BookTests)
|  groovy.lang.MissingPropertyException: No such property: title for class: org.grails.datastore.mapping.validation.ValidationErrors

Why? The errors property is of type org.grails.datastore.mapping.validation.ValidationErrors (or grails.validation.ValidationErrors). This type does support the subscript operator though, so it can be slightly shorter:

assert 'minSize.notmet' == domain.errors['title'].code

But it still needs to read the code property of FieldError, and this code does not match the constraint name.

If we look a bit further at other subclasses of org.springframework.validation.BeanPropertyBindingResult, we see that the shorthand syntax is available in org.codehaus.groovy.grails.plugins.testing.GrailsMockErrors, and it translates the codes to constraint names, so 'minSize' can be used instead of 'minSize.notmet'.

To use GrailsMockErrors, we can call mockForConstraintsTests from our test class. For example, we can call it once for the domain class to mock all instances like this:

@TestFor(Book)
class BookTests {
    @Before
    void setUp() {
        mockForConstraintsTests(Book)
    }
    void testShortTitle() {
        domain.title = 'too short'
        assert !domain.validate()
        assert 'minSize' == domain.errors.title
    }
}

The mocking will prepare the Book class, including the errors field, and make the shorthand syntax available.

It's also possible to instantiate our own Book instances instead of using "domain", and test them in the same way. The advantage is that we can use the property map constructor, although it also helps to use the 'with' method on the domain field:

@TestFor(Book)
class BookTests {
    @Before
    void setUp() {
        mockForConstraintsTests(Book)
    }
    void testShortTitle() {
        domain.with {
            title = 'too short'
            assert !validate()
            assert 'minSize' == errors.title
        }
    }
}