Friday, November 9, 2007

Fun with Groovy Closures: Variable Scope

I've been exploring variable scoping rules within Groovy closures recently, trying to map concepts from other languages into Groovy. Broadly speaking, I've found four ways to reference variables from closures within Groovy 1.0.

The first, and simplest, example is referencing a variable declared within the same scope as your closure declaration. Consider the example of printing out the integer series 1 2 3 to the console. This is trivial in Groovy:
def x = 1
3.times { println x++ }
How do you do this in Java? It isn't that simple, because Java has no closures. The closest equivalent is an anonymous implementation of the Runnable interface, combined with a one-element array:
final int[] x = new int[]{1};
Runnable closure = new Runnable() {
    public void run() {
        System.out.println(x[0]++);
    }
};

for (int y = 0; y < 3; y++) {
    closure.run();
}
The variable x needs to be final to be referenced within an anonymous class, and a final int can't be incremented, so you're stuck using AtomicInteger or declaring a one-element array, as the example shows. The for loop syntax leaves a little to be desired too... The language syntax creates a lot of noise to filter out, just to do something that is fairly trivial.
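For comparison, the AtomicInteger alternative mentioned above might look like the following sketch. The class name AtomicCounterDemo and the helper method runThreeTimes are made up for illustration; AtomicInteger itself comes from java.util.concurrent.atomic (Java 5 and later).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the name is not from the original example.
class AtomicCounterDemo {

    // Runs the "closure" three times and returns the values it printed.
    static List<Integer> runThreeTimes() {
        // The reference is final; the value it holds can still change.
        final AtomicInteger x = new AtomicInteger(1);
        final List<Integer> seen = new ArrayList<Integer>();

        Runnable closure = new Runnable() {
            public void run() {
                int current = x.getAndIncrement();   // same role as x[0]++
                System.out.println(current);
                seen.add(current);
            }
        };

        for (int y = 0; y < 3; y++) {
            closure.run();
        }
        return seen;
    }

    public static void main(String[] args) {
        runThreeTimes();   // prints 1, 2, 3 on separate lines
    }
}
```

It is no shorter than the array version, which is rather the point.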

A more complex example is referencing a variable declared within the scope of the closure, so that the variable effectively has the same scope and lifecycle as the closure. Again, the Groovy example prints out 1 2 3:

def closure = {
    def x = 1
    return { println x++ }
}
3.times closure()
In Java:
Runnable closure = new Runnable() {
    int x = 1;
    public void run() {
        System.out.println(x++);
    }
};

for (int y = 0; y < 3; y++) {
    closure.run();
}
Well, at least we didn't have to create the final one element array! One note about the Groovy version is that, in this case, the closure needs to be defined as a variable outside of the for loop. I have not been able to figure out why this is needed, but it is still a preferable syntax to the Java version simply for its brevity.

A third example is referencing a variable from a closure that is supplied at runtime by the enclosing class. Typically, in Java, this requires declaring a method that takes a parameter, but in Groovy it is much simpler:

class Utility {
    def x = 1

    def run = { closure ->
        closure.delegate = this
        3.times closure
    }
}
new Utility().run { println delegate.x++ }
Most of this code is just defining a Utility class that actually executes the closure. And for those not too familiar, the "def run" statement declares a property that holds a closure, which can then be called just like a method. The closure itself prints and increments the x variable that is supplied by the Utility class. How is this done in Java? Good Lord, you don't want to know.

//create utility class to execute closure and provide variable
class Utility {
    int x = 1;

    void run(MyInterface closure) {
        closure.run(this);
    }
}

//create interface that allows callback to utility's variable
interface MyInterface {
    void run(Utility utility);
}

//create a closure that prints and increments utility variable
MyInterface closure = new MyInterface() {
    public void run(Utility utility) {
        System.out.println(utility.x);
        utility.x++;
    }
};

//execute closure 3 times
Utility utility = new Utility();
for (int y = 0; y < 3; y++) {
    utility.run(closure);
}
I told you that you didn't want to know. In Java this is so complex that I felt obligated to add comments to the example code! And in real, production code, you probably don't want to allow public access to the x variable, so a getter and setter would be needed, adding more complexity. Just to break this down... the Utility class is defined to hold the x variable and execute the closure it is given, providing a reference to itself so the variable can be manipulated. The MyInterface interface is needed to satisfy the type system. And the MyInterface implementation needs to print and increment the variable. Wow, that's a lot of code compared to the 8 lines of Groovy.
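To make the getter and setter point concrete, here is a sketch of the same callback with x kept private. The names EncapsulatedUtility, UtilityCallback, getX, setX, and runThreeTimes are hypothetical, chosen only to avoid clashing with the example above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class that wires everything together.
class EncapsulatedUtilityDemo {

    // Runs the callback three times and returns the values it printed.
    static List<Integer> runThreeTimes() {
        final List<Integer> seen = new ArrayList<Integer>();

        UtilityCallback closure = new UtilityCallback() {
            public void run(EncapsulatedUtility utility) {
                System.out.println(utility.getX());
                seen.add(utility.getX());
                utility.setX(utility.getX() + 1);   // replaces utility.x++
            }
        };

        EncapsulatedUtility utility = new EncapsulatedUtility();
        for (int y = 0; y < 3; y++) {
            utility.run(closure);
        }
        return seen;
    }

    public static void main(String[] args) {
        runThreeTimes();   // prints 1, 2, 3 on separate lines
    }
}

// Hypothetical variant of Utility with x hidden behind accessors.
class EncapsulatedUtility {
    private int x = 1;

    int getX() { return x; }
    void setX(int x) { this.x = x; }

    // Hands itself to the callback, as in the public-field version.
    void run(UtilityCallback closure) {
        closure.run(this);
    }
}

// Hypothetical callback interface, playing the role of MyInterface.
interface UtilityCallback {
    void run(EncapsulatedUtility utility);
}
```

The increment alone now takes a getter call, an addition, and a setter call.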

So far Groovy is comparing well to other languages such as Scheme or Lisp. But there is one scenario I would like to use that doesn't quite work in Groovy... I'd like to reference a variable that is defined locally within the enclosing instance. Ideally, the code would look something like this:

//this does not work b/c x is not found at runtime!
class Utility {
    def run = { closure ->
        def x = 1
        3.times closure
    }
}
new Utility().run { println x++ }
The x variable is defined locally, so the closure that prints and increments the value does not have access to it. One option is to pass the x variable to the closure as a parameter, like so,
class Utility {
    def run = { closure ->
        def x = 1
        3.times { closure(x) }
    }
}
new Utility().run { x -> println x++ }
but this doesn't actually do what I want... sure, the closure gets called 3 times, but x is passed to it by value, so incrementing the parameter inside the closure has no effect on the actual x variable defined inside run. This prints out 1 1 1. Now, in a lot of uses this would be OK because you're passing a mutable object. But for the sake of argument, it doesn't seem possible to hand the closure the variable x itself, only a copy of its value.
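The same effect is easy to reproduce in plain Java, where arguments are also copied into the callee. The sketch below is only an analogy for the Groovy code above, and the names PassByValueDemo, IntCallback, and runThreeTimes are invented for it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; an analogy, not a translation of the Groovy code.
class PassByValueDemo {

    // Hypothetical one-argument callback, standing in for the Groovy closure.
    interface IntCallback {
        void call(int x);
    }

    // Calls the callback three times and returns the values it printed.
    static List<Integer> runThreeTimes() {
        final List<Integer> seen = new ArrayList<Integer>();

        IntCallback closure = new IntCallback() {
            public void call(int x) {
                System.out.println(x);
                seen.add(x);
                x++;   // changes only the callback's own copy
            }
        };

        int x = 1;     // local to the caller, like "def x = 1" inside run
        for (int y = 0; y < 3; y++) {
            closure.call(x);   // passes the current value, not the variable
        }
        return seen;
    }

    public static void main(String[] args) {
        runThreeTimes();   // prints 1 three times
    }
}
```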

A few suggestions were thrown out on the groovy.mn mailing list, and they all boil down to doing something similar to declaring a one element array like we had to do in Java. Of course, Groovy makes it so easy to create a name/value map that you might as well do that:

class Utility {
    def run = { closure ->
        def m = [x:1]
        3.times { closure(m) }
    }
}
new Utility().run { m -> println m.x++ }
Another option, which is more verbose (but still an option!), is to set your closure delegate to a reference to the map instead of passing the map as a parameter. I prefer the previous approach because it doesn't use a delegate. Newcomers to Groovy are going to learn about closure parameters long before closure delegates, so the code is arguably simpler. Anyhow, here is how you would do it:
class Utility {
    def run = { closure ->
        def closureDelegate = [x:1]
        closure.delegate = closureDelegate
        3.times closure
    }
}
new Utility().run { println delegate.x++ }
A final option would be to curry the original closure to get 1 2 3 to print. I won't try to explain function currying here, not because I don't have enough space, but because I just don't understand it. It doesn't exactly allow me to declare x as a local variable within the enclosing scope, but it is darn cool nonetheless:
class Utility {
    def run = { x, closure ->
        3.times { closure.call(x++) }
    }
}
def myRun = new Utility().run.curry(1)
myRun { x -> println x }
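For a rough picture of what curry(1) does here: it returns a new closure with the first parameter already fixed to 1, so only the remaining parameter still has to be supplied. A hand-rolled Java sketch of that idea follows. The names CurryDemo, IntCallback, Run, CurriedRun, and demo are hypothetical, and this illustrates partial application in general rather than how Groovy implements curry.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all type names below are invented for this sketch.
class CurryDemo {

    interface IntCallback { void call(int x); }
    interface Run { void call(int x, IntCallback closure); }   // two parameters
    interface CurriedRun { void call(IntCallback closure); }   // first one fixed

    // Returns a one-parameter version of run with x bound to a fixed value.
    static CurriedRun curry(final Run run, final int x) {
        return new CurriedRun() {
            public void call(IntCallback closure) {
                run.call(x, closure);
            }
        };
    }

    // Builds run, curries it with 1, and returns the values printed.
    static List<Integer> demo() {
        final List<Integer> seen = new ArrayList<Integer>();

        // Counterpart of: def run = { x, closure -> 3.times { closure.call(x++) } }
        Run run = new Run() {
            public void call(int x, IntCallback closure) {
                for (int y = 0; y < 3; y++) {
                    closure.call(x++);
                }
            }
        };

        CurriedRun myRun = curry(run, 1);   // like run.curry(1)
        myRun.call(new IntCallback() {
            public void call(int x) {
                System.out.println(x);
                seen.add(x);
            }
        });
        return seen;
    }

    public static void main(String[] args) {
        demo();   // prints 1, 2, 3 on separate lines
    }
}
```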
Overall, I've been superbly satisfied with Groovy, and if you're a Java developer then you owe it to yourself to go out and get it. There are some really great ideas bundled into the language.

A big thank you to Zan Thrash and Scott Vlaminck for their help and suggestions on the groovy.mn discussion board!

3 comments:

Robert Fischer said...

In Java, you can reference a variable from within an inner class if the variable is declared "final" (which, of course, you should be doing 95%+ of the time, anyway*).

* See: http://enfranchisedmind.com/blog/2007/01/11/yet-another-reason-final-is-your-friend/
http://enfranchisedmind.com/blog/2006/02/08/object-burn-is-your-friend/

Robert Fischer said...

buttsmcgeee? WTF? My sister-in-law must have gotten in and screwed with my blogger account...

Happy coder said...

For the final workaround (the one using curry), curry is in fact NOT the key to solving the issue:
class Utility5
{
    def run = { closure ->
        def x = 1
        3.times { closure.call(x++) }
    }
}
new Utility5().run { println it }