Friday, February 8, 2008

Continuation Passing Style Explained I: Java 5

Update: Like many people, I've been grappling with trying to understand continuations. In these posts, I tried to reduce continuations to their absolute essence. But in doing so, I may have missed the mark widely. I might have reduced and twisted them to the point where they aren't continuations at all, but something else entirely. As the joke goes, "Why isn't an elephant small, round, and white? Because then it would be an aspirin." Maybe all that is left of this post is a strange clever trick/ugly hack. Sorry if I've added to your confusion and misinformation... it was fun to write.

Learning a new language opens your mind to new ways of solving old problems, and so I found myself staring at my Java IDE this afternoon thinking, "What I really need is continuations and recursion." To fully convince myself I came home and spent Friday night writing a continuation based solution to my problem four different times: in Java 4 (no generics), in Java 5 (with generics), in Java 7 (using Neal Gafter's closures prototype), and in Groovy.

In this post I'll try to explain continuation passing style using Java 4 and Java 5 examples. Part II will show how the code changes with the latest closures prototype code, and Part III demonstrates the idea in Groovy.

I run across the scenario that made me think of continuation passing style fairly frequently. It's sort of a variation on the "Hole in the middle" problem described over on Enfranchised Mind. But it's really more of a "Function at the end" approach. I had reams of boilerplate foreach-loop code and just a few lines different right in the inner most loop. A simple, scaled back example is to consider writing a few functions to operator on numeric List data structures, where you need to write an exists( ) method to check for the presence of an element and an evens( ) method to filter all the odd numbers out. Your first revision will probably look something like this:

boolean exists(Integer needle, List haystack) {
for (Object element : haystack) {
if (needle.equals(element)) return true;
}
return false; // nothing found
}

List evens(List numbers) {
List result = new ArrayList();
for (Object element : numbers) {
Integer x = (Integer)element;
if ((x % 2) == 0) result.add(x);
}
return result;
}

You only have to write a couple of these functions before you realize that the list iteration methods are repeated across all of them. And the Template Method and Strategy Pattern don't exactly help here because it's hard to replicate the return statements within the functions. Some iterations stop after a few loops while others iterate over every element. Others might even perform an action on every other element. You'd never write all these loops into the language (foreach, for-first, for-every-other, for-multiples-of-3, etc.)

Enter continuations. A continuation is a function that represents the "rest of the computation given a point in the computation". Naively put, break an algorithm into two parts, have the first part call the second part in the return statement, and then the second part is your continuation. Perhaps you want to count all the files in a set of directories. You might create one function that finds all your directories ( f( ) ) and another function that counts files in a directory ( g( ) ). If you pass g( ) to f( ) as a parameter, and you code f( ) to do it's work and then call g( ), then you have composed a new function and g( ) is the continuation. Most examples you'll find on the Net have to do with tree traversal or web frameworks. The Little Schemer explains it much more simply, and I'm personally sticking with the simpler explanation. Now, full continuations are more complex than this, and I've actually just described a "continuation passing style" and not all continuations. True continuations carry a lot of rules about stack frames and variable scoping that I'm not going to touch.

Now, hopefully your language allows you to easily assign the continuation to a variable and pass it to methods, but perhaps not. You might be stuck with Anonymous Inner Classes, but that's alright. We can piece together a simple version of the evens( ) and exists( ) methods using continuations to see how they work.

The first thing to do is declare a function type:
interface Continuation {
Object call(Integer first, List rest);
}

The argument list may seem odd, but I'll explain. I've been taught to think of a foreach-loop as doing something for each element of a list. Another (better?) way to think of a foreach-loop is doing something for the first element of a list, and then repeating that something on the rest of the list until there are no more elements. That's just recursion. So if we want to emulate a foreach-loop, all we really need is a findFirst( ) method and a continuation that can do something for the first element and then call findFirst( ) again on the remainder of the list.
Object findFirst(List numbers, Continuation continuation) {
if (numbers.isEmpty()) return continuation.call(null, numbers);

Integer first = (Integer)numbers.get(0);
return continuation.call(first, numbers.subList(1, numbers.size()));
}

Basically, if the findFirst method is given an empty list, then our continuation is going to get called with null, signaling the end of the computation. Otherwise, we're going to pop the first element off the head of the list, and hand that element to the continuation along with the rest of the list. In this way you can implement an exists( ) method like so:
Boolean exists(final Integer needle, List haystack) {
return (Boolean) findFirst(haystack, new Continuation(){
public Object call(Integer first, List rest) {
if (first == null) return false; //never found
if (first.equals(needle)) return true; //match!
return exists(needle, rest); //process rest of list
}
});
}

The exists( ) method is looking for a needle (Integer) in a haystack (List). It uses the findFirst method to find the first element in haystack and passes it a continuation which knows how to proceed with the computation. In this case, if the continuation sees that a null is passed then it just returns false because no matching element was ever found. Otherwise, if the element is a match the continuation returns true, control is passed back to findFirst, back out to exists( ), and a false is received on the calling code. Finally, if there is no match, exists( ) is recursively called, but this time the list does not contain the first element.

The evens( ) method is similarly implemented in terms of the findFirst method:
List evens(final List numbers) {
return (List) findFirst(numbers, new Continuation(){
public Object call(Integer first, List rest) {
if (first == null) return Collections.emptyList(); //no more elements
List result = new ArrayList();
if (first % 2 == 0) result.add(first); //match!
result.addAll(evens(rest));
return result; //process rest of list
}
});
}
The continuation looks at the first element, if it is even then it adds the element to it's result set. Then the continuation recursively calls evens( ) on the rest of the list. In the end, you're left with a list of even numbers. The important distinction between the exists( ) and evens( ) methods is the way that both contain unique flow control. Being able to modify how your code returns or accumulates values is an important benefit of using continuation passing style in this way. Also being able to abstract out the control flow allows you to reuse that flow in different contexts. It would be difficult to use the Template or Strategy pattern and have one strategy stop the computation returning a boolean and another continue the computation accumulating Integers without repeating the code the iterates over lists. This example creates more code than it removes and introduces a functional style that is foreign to most Java code bases. Both good reasons not to use it. Other languages are different. Part II covers how the Java 7 prototype will improve this pattern, and Part III shows how Groovy would simplify it. But before moving on, let's consider how Java 5 Generics will make this code look.

I, for one, prefer this pattern with generics. Mostly because it makes me feel smarter than those poor suckers stuck on Java 1.4. We all have our own motives in life.
interface Continuation<T, U> {
U call(T first, List<T> rest);
}

<T, U> U findFirst(List<T> list, Continuation<T, U> continuation) {
if (list.isEmpty()) return continuation.call(null, list);
T first = list.get(0);
return continuation.call(first, list.subList(1, list.size()));
}

The Continuation is declared as a function which takes a T and a List of T's, and returns a U. Think forward and see how exists( ) will take an Integer and a List of Integers, and return a Boolean. Evens( ) will take the same parameters but return a List of Integers.

The implementation of findFirst is really no different, except that it now scares off novice Java developers. Its parameters are a list of T's and a continuation that accepts T's and returns U's. The benefit here is that Continuation interface can now be used type safely in many more scenarios than the previous version.

As far as the implementations of exists( ) and evens( ) goes, they really don't change at all except to pick up some more angle brackets:
Boolean exists(final Integer needle, List<Integer> haystack) {
return findFirst(haystack, new Continuation<Integer, Boolean>(){
public Boolean call(Integer first, List<Integer> rest) {
if (first == null) return false; //never found
if (first.equals(needle)) return true; //match!
return exists(needle, rest); //process rest of list
}
});
}

List<Integer> evens(final List<Integer> numbers) {
return findFirst(numbers, new Continuation<Integer, List<Integer>>(){
public List<Integer> call(Integer first, List<Integer> rest) {
if (first == null) return Collections.emptyList(); //no more elements
List<Integer> result = new ArrayList<Integer>();
if (first % 2 == 0) result.add(first); //match!
result.addAll(evens(rest));
return result; //process rest of list
}
});
}
I love generics, but I'm at a bit of a loss trying to explain how this version is better. Maybe there is a message in that. And maybe the Java 7 version will fare better.

Next up is the Java 7 Prototype version of this, along with the Groovy version.

Full executable source is available at: http://www.hamletdarcy.com/files/2008/continuations.zip (requires Junit 4 to run). Have fun!

5 comments:

murphee said...

I have the slight suspicion that you're confusing the Continuation concept with something else... I'm not sure _what_ you're confusing it with, might be Closures or Lambda functions.

As far as I can tell, you're using a whole lot of recursion to do the work. But with continuation style you can store a particular function activation (a "stack frame"), return from that and reactivate this particular activation again. (Well... more or less).

What you want, to implement the evens or exists, is simply a way to call lambdas and a set of each/select/find/fold/... implementations that iterate over data structures. Eg. in Ruby, your evens would be this:
a = [1,2,3,4,5,6,7]
a.select{|x| x % 2 == 0}

Ie. you simply pass a Block (a Lambda function) to select, which uses it as a predicate, ie. it looks at every item in the list and returns a list of all those for which the predicate returns true.

Mind you: I'm not sure if you're just confusing Continuations for something else or have stumbled upon a clever way of implementing a special case of Continuations in Java. I doubt it's the latter... but ... well, how about you read this;
http://www.intertwingly.net/blog/2005/04/13/Continuations-for-Curmudgeons
and see if what you're doing has something to do with Continuations...

Hamlet D'Arcy said...

Thanks for the comment. That is a great link.

Chapter 8 of "The Little Schemer" walks you through a similar exercise in building an "evens-only" lambda. They pass a lambda that collects the results and name it col, which short for collector. The text states the "A collector is sometimes called a continuation".

My thought was to try to separate out all the stack-frame and scoping stuff from the core of what a continuation really is, namely "the rest of the computation given a point in the computation."

Perhaps my misunderstanding is in how I read "a collector is sometimes called a continuation". I read that to mean the two words were synonyms... but perhaps it means only certain types of collectors are continuations. In which case I've just written about Collectors and not Continuations, and I should change the post to reflect that.

Does that make sense?

Hamlet D'Arcy said...

I'm updating this post to change "continuation" to "continuation passing style".
http://en.wikipedia.org/wiki/Continuation-passing_style
It seems more accurate.

murphee said...

I still don't see the connection... you're not passing a continuation (as in: a function activation, a "stack frame"), but some list data structure. Continuations and Continuation Passing is mostly just a way to formalize Goto, nothing else.

You can redefine "continuation passing" if you like, but that won't help anyone because it's just confusing. The code you provide is just a very complicated, not scalable (runs out of stack space) version of simple collector code.

Hamlet D'Arcy said...

Update: Like many people, I've been grappling with trying to understand continuations. In these posts, I tried to reduce continuations to their absolute essence. But in doing so, I may have missed the mark widely. I might have reduced and twisted them to the point where they aren't continuations at all, but something else entirely. As the joke goes, "Why isn't an elephant small, round, and white? Because then it would be an aspirin." Maybe all that is left of this post is a strange clever trick/ugly hack. Sorry if I've added to your confusion and misinformation... it was fun to write.