Friday, May 9, 2008

Functional Calisthenics

An overview of an essay from The ThoughtWorks Anthology caused quite a hoo-ha recently. (That's a cross between a brouhaha and a hoopla, see the Reddit comments and this great response).

Regardless of how you feel about the original article, it's safe to say that practicing the basics of a newly learned skill will improve your ability at it. And you should probably practice with some principles in mind or you might just be perfecting bad habits. As my guitar teacher used to say, "You can learn a bad habit in a week that takes you two weeks to unlearn." I still suck at guitar, but it seemed like sage advice when I was 12.

I'm not trying to learn OO programming, I get enough of that at my day job. But I am trying to learn functional programming. What would functional calisthenics look like? If you're trying to learn functional programming, what are some of the basic principles to try to apply? Here's what I'm trying:

Pick an interesting but small problem to work on. Mathematical problems seem to work well, like these. Problems tied to databases and filesystems don't seem to work as well.

Pick a language that supports first class functions and recursion (to some extent, at least).

And here is my list of principles to follow... please feel free to make your own suggestions:

  • Immutable Objects - Don't allow an object's state to change after it is created.
  • No State - Don't track state, either within an object or globally.
  • No side effects - Don't allow functions to perform side effects.
  • Favor Recursion - Favor recursive algorithms over iterative ones.
  • Favor Higher Order Functions - Favor passing collaborating functions as parameters, returning functions from other functions, or composing new functions out of existing ones.
I have no idea if these principles are really the ones you want to follow, so I figured I'd try this out myself.

My problem: Maximum Segment Sum (MSS). Take a list of integers; the MSS is the maximum of the sums of any number of adjacent integer. I stole the problem from a cool paper I found when trying to figure out what Squiggol meant. For this list of integers:

31 -41 59 26 -53 58 97 -93 -23 84

the MSS is 187. That is the sum of the segment that omits the first two and last three elements of the list.

The language: I picked Groovy because that is my pet right now, but be warned that closure recursion within a script doesn't work well. I ended up having to define a Class to contain my functions. Meh.

The solution:

mss = max º sum* º segs

Simple right? Maybe not. Let's break it down. To solve this we need to generate a list of all possible adjacent values from the integer list, which we'll call "all possible segments", or just segs. Then we need to sum each of those segments, and then we need to find the highest value among those sums. So this:

mss = max º sum* º segs

reading left to right is "the mss is the max of the sum function mapped across the segments". The * is the map function, which in Groovy is Collection.collect(Closure). Call sum on each segment and take the max. The tough part of this is producing all possible segments from the integer list.

The keys to producing the segments are the functions inits() and tails(). inits() returns all the initial segments of a list in order of increasing length:
inits.[a1, a2, ..., an] = [[a1], [a1, a2], ..., [a1, a2, ..., an]]
or, in Groovy
inits([1, 2, 3]) == [[1], [1,2], [1,2,3]]
Tails() returns all the trailing segments of a list in order of decreasing length:
tails.[a1, a2, ..., an] = [[a1, a2, ..., an], [a2, ...an], ..., [an]]
or, in Groovy
tails([1, 2, 3]) == [[1,2,3], [2,3], [3]]
So if you want to produce all the possible segments of the list, all you need to do is map the tails function across the result of the inits function, flattening the result:

segs = flatten º tails* º inits

Let's get Groovy.

At the highest level, the solution is easily expressed: mss = max º sum* º segs
def solve = { list ->
segs(list).collect { it.sum() }.max()
}
assert 187 == solve([31,-41,59,26,-53,58,97,-93,-23,84]
Generate a list of all possible segments from the initial list, loop over each sub-list summing the list contents, and then take the max value.

segs is also easily expressed: segs = flatten º tails* º inits
def segs = { list ->
flatten(inits(list).collect { tails(it) })
}
assert segs([1,2,3]) == [[1], [1, 2], [2], [1, 2, 3], [2, 3], [3]]
Take the inits of the list, and loop across each entry collecting the tails of the entry. Then flatten the whole thing.

inits was simple:
def inits = { list ->
(0..list.size()-1).collect {
list[0..it]
}
}
assert inits([1,2,3]) == [[1], [1, 2], [1, 2, 3]]
This is going to loop once for every element in the list, returning a sub-list from element 0 to 'count' on each loop.

Tails has a very similar implementation:
def tails = { list ->
(0..list.size()-1).collect {
list[it..list.size()-1]
}
}
assert tails([1,2,3]) == [[1,2,3],[2,3],[3]]
This will loop once for every element, returning a sub-list from element 'count' to the end on each loop.

Flatten is where things get interesting. Collection.flatten() does us no good, because we want to reduce the result into a list of lists one element deep, not just flatten the lists into integers. My first iteration used inject() to accumulate the sum, but that seemed to much like tracking state. My interim solution was a function that just walked a nested list, flattening to 1 level. I got this working but realized it could be abstracted more by creating a higher order function.

What I needed first was a function that determined whether an element in a list was a list of primitives (in which case it was a leaf in my tree) or whether it was a list of more lists (in which case it needed flattening). This wasn't too hard:
def isPrimitiveList = {
if (!it) return true
if (!(it instanceof Collection)) return false
if (it.head() instanceof Collection) {
if (it.head().head() instanceof Collection) return false
}
true
}
assert isPrimitiveList([[1], [2]])
assert false == isPrimitiveList([[[1]], [[2]]])
If the parameter is empty or null, simply return true. (Remember, Groovy considers an empty list to evaluate to false). If the parameter isn't a collection then return false too. Otherwise, return true if the head of the head is not a collection (it is a list, but the first element is a primitive).

Then we just define a generic traversal function that will take isPrimitiveList as a parameter, stopping the traversal if that function evaluates to true when passed the current node:
def traverse = {isLeaf, collection ->
if (isLeaf(collection)) return collection
return traverse(isLeaf,
traverse(isLeaf, collection.head()) +
traverse(isLeaf, collection.tail()))
}
If the element is a leaf already, then just return the element. Otherwise, return the results of traversing the head plus the results of traversing the tail. Here you see recursion (the function calls itself) and a higher order function (the function accepts another function (isLeaf) as a parameter). From here it is simple to define a flatten function:
def flatten = traverse.curry(isPrimitiveList)
assert flatten([[1],[2],[3]]) == [[1], [2], [3]]
assert flatten([[[1]],[[2]],[[3]]]) == [[1], [2], [3]]
Curry is the worst kept secret of Groovy. If you're not familiar, it simply wraps an existing closure in another one, binding the first parameter to the passed argument. flatten() is now a traverse() function where the isLeaf closure is always the isPrimitiveList function. There's more to curry than that but it's not worth rehashing.

And there you have it, an MSS solver with only immutable objects, no state, no side effects, recursion, and a higher order function. Took me about 6 hours. Without a doubt, I definitely know the Groovy Collection functions a lot better. It was also interesting to see how my first revision was much more imperative, and over time it was whittled down to something much more functional. I felt this exercise, given the constraints stated, was worthwhile. So whether you hate the idea of functional/object calisthenics or not, this helped me. Does anyone out there have other functional principles that should be included in future exercises?

Full source (and tests) available for download.

3 comments:

salient1 said...

I think a key requirement for this would be to use a functional language (like Scala) rather than twisting Groovy into it. Don't get me wrong...I'm a Groovy fan too but I think using a functional language is far more useful in getting you into the right mindset for this sort of thing.

On a side note, the OO calisthenics mentioned at the top of your post are a good starting point for people. The rules are harsh and shouldn't be used for real-world development but they are a very useful exercise. The truth is that most people still do not understand OOA&D in the Java community. Even the ones that think they do.

Hamlet D'Arcy said...

I absolutely 100% agree with the above comment: use a functional language for this, don't try to twist Groovy into doing it. Since this post I've been working in F# and so many of the ideas and constructs you need are simply part of the language and libraries. I didn't realize how much time I was spending twisting Groovy around rather than just trying to practice FP.

F# has been really cool and runs quite easily on Linux/Mono.

paulk_asert said...

Just closing the trail for those finding this post by search and not having time to search forward where better solutions are shown.

Even though the currying example in this post is nice, it isn't needed as Groovy's sum() operator effectively does "flatten once" for lists. And the spread dot operator reads much nicer than collect once you are used to it for some of the example code shown too.