Tuesday, July 21, 2009

Immutability, Javari, and @ReadOnly: JSR 308

Behind the curve, as always, I'm only just realizing that Java is in the process of introducing an Official Way to assemble one of the important puzzle pieces for making systems with cooperating untrusted code: namely, immutability.

Enforced immutability makes it easy for you to share a data structure with a piece of code that should be allowed to read, but not write, the structure. There are a number of situations where this is useful. The Javari paper, for example, cites the way Class.getSigners was implemented in early versions of Java:


class Class {
    private Object[] signers;

    Object[] getSigners() {
        return signers;
    }
}


Because raw Java does not provide for read-only array elements, and because the method returns the array actually used to implement Java security, this is a security hole: all you need to do to make a class appear to have a different set of signers is alter elements in the returned array.

The conventional Java programming technique to avoid this problem is to make and return a copy of the array, but that adds verbosity to the code, is prone to errors (what if you forget to make a copy in just one place?), and is not efficient (if you invoke the method a lot, you make a lot of array copies).
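For concreteness, here is a minimal sketch of that copy-on-return idiom. The class and method names (SignedThing, getSignersUnsafe) are made up for illustration and are not from the JDK or the Javari paper.

```java
import java.util.Arrays;

// Hypothetical stand-in for the Class example above; all names are illustrative.
class SignedThing {
    private final Object[] signers;

    SignedThing(Object... signers) {
        // Copy on the way in, so the caller cannot keep a handle on our array.
        this.signers = signers.clone();
    }

    // Leaky version: hands out the internal array itself.
    Object[] getSignersUnsafe() {
        return signers;
    }

    // Defensive version: every call allocates a fresh copy.
    Object[] getSigners() {
        return signers.clone();
    }
}

class DefensiveCopyDemo {
    public static void main(String[] args) {
        SignedThing t = new SignedThing("alice", "bob");

        t.getSigners()[0] = "mallory";        // changes only the throwaway copy
        System.out.println(Arrays.toString(t.getSigners()));   // [alice, bob]

        t.getSignersUnsafe()[0] = "mallory";  // changes the real signer list
        System.out.println(Arrays.toString(t.getSigners()));   // [mallory, bob]
    }
}
```

The safe getter pays for one array allocation per call, and nothing stops a later maintainer from adding a second getter that forgets the clone, which is exactly the fragility described above.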

Javari makes it easy to do something like this:


class Class {
    private Object[] signers;

    readonly Object[] getSigners() {
        return signers;
    }
}


which has the effect of prohibiting modification of the returned array elements, so that even untrusted code can invoke the method without ill effects.
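Plain Java has no readonly keyword, but the closest everyday analogue I know of is an unmodifiable view from java.util.Collections. The sketch below swaps the array for a List, which is my substitution and not what Javari does, and the class names are hypothetical. The big difference is that the check happens at run time, when the write is attempted, rather than at compile time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class for illustration; not from the Javari paper.
class SignerRegistry {
    private final List<Object> signers = new ArrayList<>();

    // A view, not a copy: it reflects later changes to 'signers'
    // but rejects every mutating call made through it.
    private final List<Object> readOnlyView = Collections.unmodifiableList(signers);

    void addSigner(Object signer) {
        signers.add(signer);
    }

    List<Object> getSigners() {
        return readOnlyView; // same object on every call, no copying
    }
}

class ReadOnlyViewDemo {
    // Returns true if an attempt to overwrite element 0 was rejected.
    static boolean tamperingRejected(List<Object> view) {
        try {
            view.set(0, "mallory");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        SignerRegistry registry = new SignerRegistry();
        registry.addSigner("alice");
        System.out.println(tamperingRejected(registry.getSigners()));
        System.out.println(registry.getSigners());
    }
}
```

Note that the view is shallow: it protects the list, not the objects stored in it, so a mutable element can still be altered by whoever gets hold of it.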

JSR 308, planned for Java 7, allows annotations on types. This extends Java annotations so that they can stand in for the readonly keyword above, which means Javari can be expressed in standard (JSR-308-extended) Java.
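As a rough sketch of what that looks like, assuming a JDK that supports type annotations (Java 8 or later): the @ReadOnly annotation below is one I declare myself for illustration, not the one shipped with any real checker, and placing it on the array type rather than the element type is my guess at the closest match for the Javari example. By itself the annotation enforces nothing; it only records the intent where a compiler plug-in, or anything else that can read class metadata, could find it.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.AnnotatedType;

// Hypothetical marker annotation, declared here only for illustration.
@Retention(RetentionPolicy.RUNTIME)   // keep it in the class file and visible to reflection
@Target(ElementType.TYPE_USE)         // allow it anywhere a type is written
@interface ReadOnly {}

class AnnotatedSigners {
    private Object[] signers = { "alice" };

    // The annotation sits on the array type itself.
    Object @ReadOnly [] getSigners() {
        return signers;
    }

    // Unannotated twin, for comparison.
    Object[] getSignersPlain() {
        return signers;
    }
}

class TypeAnnotationDemo {
    // Reports whether the named no-argument method declares a @ReadOnly return type.
    static boolean returnTypeIsReadOnly(Class<?> c, String methodName) {
        try {
            AnnotatedType t = c.getDeclaredMethod(methodName).getAnnotatedReturnType();
            return t.isAnnotationPresent(ReadOnly.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(returnTypeIsReadOnly(AnnotatedSigners.class, "getSigners"));
        System.out.println(returnTypeIsReadOnly(AnnotatedSigners.class, "getSignersPlain"));

        // Without a checker, nothing stops this write: the annotation is only metadata.
        new AnnotatedSigners().getSigners()[0] = "mallory";
    }
}
```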

And now we come to the end of my expertise on the subject, because I've only just glanced at the main Java page's blurb about JSR 308 and barely begun to read the Javari paper.

But I've got a question: is the Javari checker only a compile-time thing, and not a load-time thing? Compile-time checking is great, of course; in general, the earlier you can catch a potential error, the better. But a compile-time-only mechanism is likely insufficient if you're interested in enforcing rules on untrusted code, because:

  • You may well not have access to the source for the untrusted code you'd like to run, and
  • Even if you have the source code, it may not be compilable with the Java compiler. With the recent proliferation of high-quality JVM-compatible languages, another language may express an algorithm better than Java itself, and you'd want to give people the freedom to submit untrusted code written in the language of their choice.

It may be that there's some straightforward way to put the pieces of the JSR 308 kit into a class loader and do load-time verification of Javari's immutability rules, but I haven't yet spotted anything that says this is the case. Guess I've got my summer reading assignment.

Monday, July 20, 2009

Cooperating Untrusted Code in Java and LSL

I've been spending a fair amount of time in Second Life recently, and so I've had a chance to play with LSL, the Linden Scripting Language, which lets you program the behavior of objects in the world of Second Life. That means I've had the chance to look at LSL's solution to a problem I've given some thought to: how do you invite people to contribute code to a common environment, without worrying too much that malicious (or incompetent) contributors will hijack or crash the system?

This is an old question, and there are some well-established answers to it. From the beginning, the Java programming language has included a sandbox security model suitable, for example, for running applets in a browser. The default security level for an applet prevents most possible types of misbehavior—for example, it prohibits access to the local file system—but it doesn't have any provision for different applets to interact directly with each other; the security restrictions more or less make different applets running in the same process invisible to one another (although they could, for example, communicate with a common server). The Java security model also does not have provisions for restricting either the CPU time or memory consumption of untrusted code, and so does not deal with denial-of-service attacks that overuse those resources.

LSL, by contrast, includes explicit CPU time and memory restrictions on running scripts. Like Java, it doesn't let you access any of the underlying system (like local files), but it does this not with a security manager that explicitly prevents such operations, but simply by omitting the operations from the language. It can afford to, because unlike Java, LSL doesn't also have to run general-purpose applications in a less restricted environment.

And unlike Java applets, LSL does provide a fairly straightforward way for different programs to communicate with one another. Communication among running LSL scripts takes the form of text strings sent through channels, each channel denoted by a 32-bit integer. Publish the channel number and message format, and any object in the vicinity can talk with you. Objects in Second Life can interact with each other in ways independent of LSL, too. For example, objects with the physical attribute can collide with each other and push each other around.

Exciting though it is to be able to program things as visual and interactive as the content of Second Life, LSL has plenty of drawbacks. It is not a particularly full-featured language: it lacks, for example, proper arrays, which makes it impossible to build efficient higher-level data structures like hash tables. Worse, there is no way to share code across scripts; “libraries”, to the extent that such exist, are just pieces of source code that you can paste into a script.

LSL has security holes, too: for example, how does a set of cooperating LSL scripts open a private channel among themselves? Obviously, if you want to break some code that's communicating on a particular channel, and you know the channel, you can spoof a valid message, so there's a way to prevent that, right? Well, not really, no. The best you can do is set the permissions on a script that chooses a channel number so that other people can't edit the script and see the channel-choosing algorithm you used, but there's never any truly reliable way to guarantee that your channel won't collide with one used by some other object in the vicinity.
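To make the problem concrete, here is a toy model of channel messaging written in Java. It is a sketch of the idea only, not LSL's API or its actual implementation, and every name in it (ChannelBus, listen, say) is hypothetical. The thing to notice is that a handler receives only the message text, so a listener has no basis for telling a cooperating sender from anyone else who knows, or stumbles on, the channel number.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy model of integer-numbered text channels; all names are hypothetical.
class ChannelBus {
    private final Map<Integer, List<Consumer<String>>> listeners = new HashMap<>();

    // Register a handler for every message sent on the given channel.
    void listen(int channel, Consumer<String> handler) {
        listeners.computeIfAbsent(channel, k -> new ArrayList<>()).add(handler);
    }

    // Deliver a message to every listener on the channel.
    // Returns how many listeners received it.
    int say(int channel, String message) {
        List<Consumer<String>> handlers = listeners.get(channel);
        if (handlers == null) {
            return 0;
        }
        for (Consumer<String> handler : handlers) {
            handler.accept(message); // the sender's identity is not passed along
        }
        return handlers.size();
    }
}

class ChannelDemo {
    public static void main(String[] args) {
        ChannelBus bus = new ChannelBus();
        int secretChannel = -734211; // hypothetical "private" channel number

        List<String> doorLog = new ArrayList<>();
        bus.listen(secretChannel, doorLog::add);

        bus.say(secretChannel, "open"); // sent by the owner's remote control
        bus.say(secretChannel, "open"); // sent by an intruder who learned the number

        System.out.println(doorLog);    // [open, open]: the two are indistinguishable
    }
}
```

In this model the channel number is the only secret, and a 32-bit integer chosen by each script independently gives no protection against collision with a neighbor's choice.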

Another problem with interobject communication in LSL is speed: channels appear to be implemented with operating system sockets or some other relatively slow mechanism—certainly they don't seem to have the throughput you would expect if they were implemented as a message-delivering data structure within a single process. This means you can't treat message-sending as a sort of interobject method invocation: it's orders of magnitude too slow.

Neither Java nor LSL makes it really easy to create systems where pieces can cooperate closely with other pieces that are not fully trustworthy. What kind of implementation would be required to do that, and what kinds of things would you be able to build with it?

Friday, July 17, 2009

Who Am I? Why Am I Here?

You might recognize the title of this post as Admiral Stockdale's opening line from the 1992 vice-presidential debate. Stockdale was famously unprepared for the debate, and something in the tone of his opening rhetorical questions gave the impression that he really didn't know the answer.

It's cheap and easy to criticize someone else's mistake as a way of claiming that you're not making the same mistake. So of course I want you to believe that I do know who I am and why I'm here.

I write software, sometimes for a living, sometimes just because software is interesting and figuring out how to do something complicated that I've never done before is more satisfying than a lot of things. I mostly use Java, because I know it pretty well, and because it strikes a good balance between understandability and efficiency. I'm more interested in making big complex things run fast, and not so interested in the zillions of little workhorse programs that don't do anything particularly speedy or exotic but make the world run. I think in terms of frameworks more than applications, a rare luxury when you're writing software for money, and money is something I'm not expecting to make from my own projects.

I'm starting this blog as a way of explaining to myself (and anyone bored enough to tune in) what I'm doing and why. There's no better way to discover that your ideas are full of holes than to try to give a coherent explanation of them.

I expect to be an inconstant poster, subject to the ebb and flow of distraction and concentration, enthusiasm and apathy.