Saturday, April 21, 2012

Intro to Functional Programming (with examples in Java) - Part 4

Immutable data structures

In the past few posts, we've seen some interesting things that can be done with first-class and higher-order functions. However, I've glossed over the fact that functional programming also tends to rely on parameters being immutable. For example, in the previous post, memoization relies on an underlying HashMap, where the keys are function arguments. We assumed that the arguments implemented reasonable hashCode and equals methods, but things fall apart if those function arguments can be modified after they've been memoized. (In particular, their hashCode method may then return a different value, but the memoization HashMap has already placed them in a bucket based on their previous hashCode.) Since functional programming dictates that the same function called with the same arguments should always return the same value, one of the easiest ways to guarantee that is to ensure that inputs and outputs are immutable.

Immutability also makes it easier to guarantee thread safety. If a given value is immutable, then any thread can access it, and read from it, without fearing that another thread may modify it (since it cannot be modified).

In some of my first few posts, I described an implementation of immutable binary trees. In this post, we're going to look at some simpler immutable data structures, namely ImmutableLists and Options.

Immutable Lists

For lists, we're going to implement a singly linked list structure that supports only two "modification" operations, prepend and tail. These add a new element at the head of the list, and return the list without the head element, respectively. These are both O(1) operations, since prepend simply creates a new head node whose tail is the previous list, while tail returns the list pointed to by the tail of the head node. Neither operation modifies the existing list; instead, each returns a "new" list reference (where prepend creates a new list reference, and tail returns a list reference that already exists). To extract the actual elements from the list, we'll have a single operation, head, that returns the element at the front of the list. Readers familiar with Lisp should recognize head and tail as car and cdr. For convenient list traversal, we'll use the familiar Java Iterable interface, so we can use the friendly Java 1.5 "foreach" syntax. Also, for convenience, we'll define an isEmpty method.

As with the immutable binary trees, we're going to take advantage of immutability to assert that there is only one empty list (regardless of type arguments), and create a specialized singleton subclass for it, called EmptyList. Everything else will be a NonEmptyList. Here is the code:
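In outline, the structure looks something like this -- a simplified sketch of what I've described, so details (visibility, error messages, and so on) may differ from the original:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// One shared EmptyList instance; every non-empty list is a NonEmptyList
// whose tail is another NonEmptyList or the EmptyList singleton.
public abstract class ImmutableList<T> implements Iterable<T> {

    public abstract T head();
    public abstract ImmutableList<T> tail();
    public abstract boolean isEmpty();

    // O(1): wraps this list in a new head node; this list is untouched.
    public ImmutableList<T> prepend(T element) {
        return new NonEmptyList<T>(element, this);
    }

    // The empty list is the same object for every element type, so the
    // unchecked cast is safe: there are no elements to have the wrong type.
    @SuppressWarnings("unchecked")
    public static <T> ImmutableList<T> nil() {
        return (ImmutableList<T>) EmptyList.INSTANCE;
    }

    public Iterator<T> iterator() {
        final ImmutableList<T> start = this;
        return new Iterator<T>() {
            private ImmutableList<T> current = start;
            public boolean hasNext() { return !current.isEmpty(); }
            public T next() {
                T value = current.head();
                current = current.tail();
                return value;
            }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    private static final class EmptyList extends ImmutableList<Object> {
        static final EmptyList INSTANCE = new EmptyList();
        private EmptyList() {}
        public Object head() { throw new NoSuchElementException("head of empty list"); }
        public ImmutableList<Object> tail() { throw new NoSuchElementException("tail of empty list"); }
        public boolean isEmpty() { return true; }
    }

    private static final class NonEmptyList<T> extends ImmutableList<T> {
        private final T head;
        private final ImmutableList<T> tail;
        NonEmptyList(T head, ImmutableList<T> tail) { this.head = head; this.tail = tail; }
        public T head() { return head; }
        public ImmutableList<T> tail() { return tail; }
        public boolean isEmpty() { return false; }
    }
}
```

Note that prepend never copies anything: lists that share a tail share those nodes, which is safe precisely because nothing can mutate them.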

That's not terribly complicated, but also a little unpleasant to use. In particular, the only way we have of creating a list is via the nil() factory method, which returns an empty list, followed by a series of prepend() operations. Let's add a factory method that takes a varargs parameter to produce a list. Since we only have the prepend operation at our disposal, we'll need to iterate through the varargs array backwards:

Let's create a quick test to confirm that the list factory method and list iteration actually work:
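The factory and the iteration check can be sketched together like this (a compact reconstruction -- anonymous classes stand in for the EmptyList/NonEmptyList pair above):

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

abstract class ImmutableList<T> implements Iterable<T> {
    abstract T head();
    abstract ImmutableList<T> tail();
    abstract boolean isEmpty();

    ImmutableList<T> prepend(final T h) {
        final ImmutableList<T> t = this;
        return new ImmutableList<T>() {
            T head() { return h; }
            ImmutableList<T> tail() { return t; }
            boolean isEmpty() { return false; }
        };
    }

    @SuppressWarnings("unchecked")
    static <T> ImmutableList<T> nil() { return (ImmutableList<T>) NIL; }

    // The varargs factory: prepend is all we have, so walk the array
    // backwards so that the first argument ends up at the head.
    static <T> ImmutableList<T> list(T... elements) {
        ImmutableList<T> result = nil();
        for (int i = elements.length - 1; i >= 0; i--) {
            result = result.prepend(elements[i]);
        }
        return result;
    }

    public Iterator<T> iterator() {
        final ImmutableList<T> start = this;
        return new Iterator<T>() {
            private ImmutableList<T> current = start;
            public boolean hasNext() { return !current.isEmpty(); }
            public T next() { T v = current.head(); current = current.tail(); return v; }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    private static final ImmutableList<Object> NIL = new ImmutableList<Object>() {
        Object head() { throw new NoSuchElementException(); }
        ImmutableList<Object> tail() { throw new NoSuchElementException(); }
        boolean isEmpty() { return true; }
    };
}
```

With this in place, `for (int i : ImmutableList.list(1, 2, 3))` visits 1, 2, 3 in order.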

Now, since it's a frequently used operation, let's add a size method. The size of a list can be defined recursively, as follows: the size of the empty list is 0, the size of a non-empty list is the size of its tail plus 1. As a first try, we'll define size exactly that way:
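The naive version, sketched on the compact list from before (this is the one that will get us in trouble shortly):

```java
import java.util.NoSuchElementException;

abstract class ImmutableList<T> {
    abstract T head();
    abstract ImmutableList<T> tail();
    abstract boolean isEmpty();

    ImmutableList<T> prepend(final T h) {
        final ImmutableList<T> t = this;
        return new ImmutableList<T>() {
            T head() { return h; }
            ImmutableList<T> tail() { return t; }
            boolean isEmpty() { return false; }
        };
    }

    @SuppressWarnings("unchecked")
    static <T> ImmutableList<T> nil() { return (ImmutableList<T>) NIL; }

    // The recursive definition, verbatim: 0 for the empty list, otherwise
    // 1 + the size of the tail.  The "+ 1" happens *after* the recursive
    // call returns, which is exactly why this is not a tail call.
    int size() {
        return isEmpty() ? 0 : 1 + tail().size();
    }

    private static final ImmutableList<Object> NIL = new ImmutableList<Object>() {
        Object head() { throw new NoSuchElementException(); }
        ImmutableList<Object> tail() { throw new NoSuchElementException(); }
        boolean isEmpty() { return true; }
    };
}
```

This works fine on small lists; the trouble only shows up when the recursion gets tens of thousands of calls deep.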

That's nice and simple. Let's create a unit test for it:

Uh-oh -- at least on my machine, that test triggered a StackOverflowError. If it doesn't trigger a StackOverflowError on your machine (which should be unlikely -- the Java stack usually overflows around 50k calls), try increasing the value of the size local variable in the test. The problem is that our recursive definition is triggering 100k recursive calls to size(). But "Michael," I hear you say, "aren't recursive calls a fundamental keystone of functional programming?" Well, yes, but even in Scheme, Haskell, or Erlang, I believe this particular implementation would blow the stack. For those languages, the problem is that our method is not tail-recursive. A tail-recursive method/function is one where any recursive call is the last operation that executes. Since we add 1 to the result of the recursive call to size, that addition is the last operation to execute. Many functional languages implement what's called tail-call optimization, which basically turns tail-calls into a goto that returns to the beginning of the method. To clarify, let's first implement a tail-recursive version of size:
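A sketch of the tail-recursive version (the name sizeHelper follows the text; the rest of the list is as before):

```java
import java.util.NoSuchElementException;

abstract class ImmutableList<T> {
    abstract T head();
    abstract ImmutableList<T> tail();
    abstract boolean isEmpty();

    ImmutableList<T> prepend(final T h) {
        final ImmutableList<T> t = this;
        return new ImmutableList<T>() {
            T head() { return h; }
            ImmutableList<T> tail() { return t; }
            boolean isEmpty() { return false; }
        };
    }

    @SuppressWarnings("unchecked")
    static <T> ImmutableList<T> nil() { return (ImmutableList<T>) NIL; }

    int size() { return sizeHelper(this, 0); }

    // acc carries the size so far, so the recursive call is the very last
    // operation -- a genuine tail call.  javac still compiles it as a real
    // method call, though, so a long enough list still overflows the stack.
    private static <T> int sizeHelper(ImmutableList<T> list, int acc) {
        return list.isEmpty() ? acc : sizeHelper(list.tail(), acc + 1);
    }

    private static final ImmutableList<Object> NIL = new ImmutableList<Object>() {
        Object head() { throw new NoSuchElementException(); }
        ImmutableList<Object> tail() { throw new NoSuchElementException(); }
        boolean isEmpty() { return true; }
    };
}
```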

In this case, we've implemented a helper method that keeps a so-called accumulator, that keeps track of the size up to now, in the recursive call stack. The recursive call to sizeHelper is simply modifying the input parameters, and jumping back to the beginning of the method (which is why tail-call optimization is so easy to implement). Unfortunately, Java compilers do not traditionally implement tail-call optimization, so this code will still cause a StackOverflowError on our unit test. Instead, we can simulate tail-call optimization by optimizing by hand:

If Java did support tail-call optimization, it would produce roughly the same bytecode as the code above. The code above passes the unit test with flying colours. That said, because Java does not support tail-call optimization, we are probably better off inlining sizeHelper to produce the following:
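The inlined, loop-based version might look like this (again a sketch on the compact list):

```java
import java.util.NoSuchElementException;

abstract class ImmutableList<T> {
    abstract T head();
    abstract ImmutableList<T> tail();
    abstract boolean isEmpty();

    ImmutableList<T> prepend(final T h) {
        final ImmutableList<T> t = this;
        return new ImmutableList<T>() {
            T head() { return h; }
            ImmutableList<T> tail() { return t; }
            boolean isEmpty() { return false; }
        };
    }

    @SuppressWarnings("unchecked")
    static <T> ImmutableList<T> nil() { return (ImmutableList<T>) NIL; }

    // Exactly what tail-call optimization would generate: reassign the
    // "parameters" and jump back to the top of the loop instead of calling.
    int size() {
        int acc = 0;
        ImmutableList<T> list = this;
        while (!list.isEmpty()) {
            acc = acc + 1;
            list = list.tail();
        }
        return acc;
    }

    private static final ImmutableList<Object> NIL = new ImmutableList<Object>() {
        Object head() { throw new NoSuchElementException(); }
        ImmutableList<Object> tail() { throw new NoSuchElementException(); }
        boolean isEmpty() { return true; }
    };
}
```

This version is happy to count a 100,000-element list, where the recursive versions blow the stack.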

Note that Scala, a JVM language, would actually optimize the previous tail-call (by producing the goto bytecode to return to the start of the helper method), so we wouldn't have to optimize it into a while loop. That said, if you come from an object-oriented background, the optimized version may actually be more intuitive. Where Scala is not able to match "traditional" functional programming languages is with "mutual tail-calls", where function a tail-calls function b, which tail-calls function a. I believe you could implement this by hand in C by longjmping to the beginning of the body of the other function/procedure, but my understanding is that the JVM's builtin security requirements prevent you from gotoing to code outside the current method. Basically, the goto bytecode is only designed to make control statements (if, while, for, switch, etc.) work within the current method, I think. If I recall correctly, Clojure (another JVM language) is able to produce optimized mutual tail-calls if you use "trampolines", but I have no idea how those work at the bytecode level.

For another example of a function that we can implement with tail-recursion (and would be implemented with tail-recursion in languages that lack while loops), consider reversing a list. When we reverse a list, the first element becomes the last element, then we prepend the second element to that first element, then prepend the third element to that, etc. As a recursive function, we pass in the original list and an empty list as the initial accumulator value. We prepend the head to the accumulator, and recurse with the tail of the original list and the new accumulator value:
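A sketch of the tail-recursive reverse (helper name is my own; the list is the compact version from before):

```java
import java.util.NoSuchElementException;

abstract class ImmutableList<T> {
    abstract T head();
    abstract ImmutableList<T> tail();
    abstract boolean isEmpty();

    ImmutableList<T> prepend(final T h) {
        final ImmutableList<T> t = this;
        return new ImmutableList<T>() {
            T head() { return h; }
            ImmutableList<T> tail() { return t; }
            boolean isEmpty() { return false; }
        };
    }

    @SuppressWarnings("unchecked")
    static <T> ImmutableList<T> nil() { return (ImmutableList<T>) NIL; }

    ImmutableList<T> reverse() {
        return reverseHelper(this, ImmutableList.<T>nil());
    }

    // Tail call: move the head onto the accumulator, recurse on the tail.
    private static <T> ImmutableList<T> reverseHelper(ImmutableList<T> list,
                                                      ImmutableList<T> acc) {
        return list.isEmpty() ? acc : reverseHelper(list.tail(), acc.prepend(list.head()));
    }

    private static final ImmutableList<Object> NIL = new ImmutableList<Object>() {
        Object head() { throw new NoSuchElementException(); }
        ImmutableList<Object> tail() { throw new NoSuchElementException(); }
        boolean isEmpty() { return true; }
    };
}
```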

As with size, this implementation of reverse will overflow the stack on large lists. Also, like with size, we can perform the tail-call optimization by hand in almost exactly the same way:

Below, we will make extensive use of reverse, and use while loops to implement several other methods that would traditionally (in the functional world) be written with tail-recursive functions.

Anyway, let's move on to an even simpler collection type, which essentially makes null obsolete.

Option

The Option type is effectively a collection of size zero or one, with the semantics of "maybe" having a value. (In fact, the Haskell equivalent of Option is the Maybe monad. I personally like the name Maybe better, since the idea of a function that returns Maybe int nicely expresses that it may return an int, but might not. That said, I've decided to go with Scala's naming in this case, and call it Option.)

Why would we want to return or keep an Option value instead of just returning or keeping an object reference? We do this partly to explicitly document the fact that a method may not return a valid value, or that a field may not be initialized yet. In traditional Java, this is accomplished by returning null or storing a null object reference. Unfortunately, any object reference could be null, so this leads to the common situation of developers adding defensive null-checks everywhere, including on return values from methods that will never return null (since they are guaranteed to produce a valid result in non-exceptional circumstances). With Option, you're able to explicitly say "This method may not have a defined return value for all inputs." Assuming you and your team settle on a coding standard where you never return null from a method, you should be able to guarantee that a method that returns String returns a well-defined (not null) String, while a method that returns Option<String> might not.

As with our other immutable collections, we'll follow the same pattern: abstract base class (in this case Option), with a singleton subclass for the empty collection (called None) and a subclass for non-empty collections (called Some). As with immutable lists, there will be no public constructors, but instances will be made available through factory methods on the base Option class.
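The skeleton might look like this (a reconstruction following the description above):

```java
import java.util.NoSuchElementException;

public abstract class Option<T> {

    public abstract boolean isDefined();
    public abstract T get();

    public static <T> Option<T> some(T value) { return new Some<T>(value); }

    // One None for all type arguments, just like the EmptyList singleton.
    @SuppressWarnings("unchecked")
    public static <T> Option<T> none() { return (Option<T>) None.INSTANCE; }

    private static final class None extends Option<Object> {
        static final None INSTANCE = new None();
        private None() {}
        public boolean isDefined() { return false; }
        public Object get() { throw new NoSuchElementException("get() on None"); }
    }

    private static final class Some<T> extends Option<T> {
        private final T value;
        Some(T value) { this.value = value; }
        public boolean isDefined() { return true; }
        public T get() { return value; }
    }
}
```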

Now, if we have a method that returns an Option, we can call isDefined() before calling get() to avoid having an exception thrown. Of course, this is only marginally better than just using null pointers, since the Option warns us that we should call isDefined(), but we're still using the same clunky system of checks. It would be nicer if we could just say "Do something to the value held in this Option, if it is a Some, or propagate the None otherwise", or simply "Do something with a Some, and nothing with a None". Fortunately, we can do both of these. Let's do the second one first, by making Option iterable:

Here is a test showing how we can iterate over Options to execute code only if there is an actual value:
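Sketched with a compact Option (anonymous classes instead of the named Some/None) plus the iterator and a foreach check:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Some yields its one value, None yields nothing: an Iterable of size 0 or 1.
abstract class Option<T> implements Iterable<T> {
    abstract boolean isDefined();
    abstract T get();

    static <T> Option<T> some(final T value) {
        return new Option<T>() {
            boolean isDefined() { return true; }
            T get() { return value; }
        };
    }

    @SuppressWarnings("unchecked")
    static <T> Option<T> none() { return (Option<T>) NONE; }

    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private boolean consumed = !isDefined();
            public boolean hasNext() { return !consumed; }
            public T next() {
                if (consumed) throw new NoSuchElementException();
                consumed = true;
                return get();
            }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    private static final Option<Object> NONE = new Option<Object>() {
        boolean isDefined() { return false; }
        Object get() { throw new NoSuchElementException(); }
    };
}
```

The loop body runs once for a Some and not at all for a None -- no isDefined()/get() pair in sight.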

Using this idiom, we can wrap the None check and the extraction of the value into the for statement. It's a little more elegant than writing if (a.isDefined()) a.get(), and ties in nicely with the idea of thinking of Option as a collection of size 0 or 1.

We can also replace a common Java null idiom, where we use a default value if a variable is null, using the getOrElse method:

Here is a test that shows how that works (and verifies the standard get() behaviour):
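Here's a sketch of getOrElse on the compact Option -- it replaces the familiar `x != null ? x : defaultValue` idiom:

```java
import java.util.NoSuchElementException;

abstract class Option<T> {
    abstract boolean isDefined();
    abstract T get();

    // The value if there is one, otherwise the supplied default.
    T getOrElse(T defaultValue) {
        return isDefined() ? get() : defaultValue;
    }

    static <T> Option<T> some(final T value) {
        return new Option<T>() {
            boolean isDefined() { return true; }
            T get() { return value; }
        };
    }

    @SuppressWarnings("unchecked")
    static <T> Option<T> none() { return (Option<T>) NONE; }

    private static final Option<Object> NONE = new Option<Object>() {
        boolean isDefined() { return false; }
        Object get() { throw new NoSuchElementException("get() on None"); }
    };
}
```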

Before getting into some other collection operations that we'll add to ImmutableList and Option, let's add a convenient factory method that wraps an object reference in Some or returns None, depending on whether the reference is null.

And the test:
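The factory is a one-liner; sketched here on the compact Option:

```java
import java.util.NoSuchElementException;

abstract class Option<T> {
    abstract boolean isDefined();
    abstract T get();

    static <T> Option<T> some(final T value) {
        return new Option<T>() {
            boolean isDefined() { return true; }
            T get() { return value; }
        };
    }

    @SuppressWarnings("unchecked")
    static <T> Option<T> none() { return (Option<T>) NONE; }

    // Bridges null-returning APIs into the Option world:
    // null becomes None, anything else is wrapped in Some.
    static <T> Option<T> option(T value) {
        return value == null ? Option.<T>none() : some(value);
    }

    private static final Option<Object> NONE = new Option<Object>() {
        boolean isDefined() { return false; }
        Object get() { throw new NoSuchElementException(); }
    };
}
```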

Higher-order Collection Methods

In the second post in this series, we saw the map and fold higher-order functions. Since these require a List parameter anyway (and, specifically, according to my previous implementation, one of those nasty mutable java.util.Lists), for convenience, we can add map and fold operations to our immutable collections as methods. While we're at it, we'll add two more higher-order functions, filter and flatMap. Since all of our collections should support for-each syntax, we'll specify the contract for our methods as an interface extending Iterable.

We'll implement these methods on ImmutableList and Option. Let's look at map first:

To establish that these map methods work, let's try mapping a function that doubles integers:
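A sketch of map on the list, with a simplified one-method Function1 (Option's map is analogous: Some(v) becomes Some(f(v)), and None stays None):

```java
import java.util.NoSuchElementException;

interface Function1<T, R> { R apply(T t); }

abstract class ImmutableList<T> {
    abstract T head();
    abstract ImmutableList<T> tail();
    abstract boolean isEmpty();

    ImmutableList<T> prepend(final T h) {
        final ImmutableList<T> t = this;
        return new ImmutableList<T>() {
            T head() { return h; }
            ImmutableList<T> tail() { return t; }
            boolean isEmpty() { return false; }
        };
    }

    @SuppressWarnings("unchecked")
    static <T> ImmutableList<T> nil() { return (ImmutableList<T>) NIL; }

    // Prepending as we go builds the result backwards, so reverse at the end.
    <R> ImmutableList<R> map(Function1<T, R> f) {
        ImmutableList<R> acc = nil();
        for (ImmutableList<T> cur = this; !cur.isEmpty(); cur = cur.tail()) {
            acc = acc.prepend(f.apply(cur.head()));
        }
        return acc.reverse();
    }

    ImmutableList<T> reverse() {
        ImmutableList<T> acc = nil();
        for (ImmutableList<T> cur = this; !cur.isEmpty(); cur = cur.tail()) {
            acc = acc.prepend(cur.head());
        }
        return acc;
    }

    private static final ImmutableList<Object> NIL = new ImmutableList<Object>() {
        Object head() { throw new NoSuchElementException(); }
        ImmutableList<Object> tail() { throw new NoSuchElementException(); }
        boolean isEmpty() { return true; }
    };
}
```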

Next, we'll move on to the two folds. For an ImmutableList, the foldLeft operation traverses the list in the natural order, while a foldRight is easier if we reverse the list first. For Option, since there is (at most) one element, the only difference between the folds is the order in which parameters are passed to the function argument. To distinguish between foldLeft and foldRight, we need to use a non-commutative operation, so we'll use string concatenation in the tests.
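A sketch of both folds on the list (parameter orders follow the usual left/right convention; the original may differ in detail):

```java
import java.util.NoSuchElementException;

interface Function2<A, B, R> { R apply(A a, B b); }

abstract class ImmutableList<T> {
    abstract T head();
    abstract ImmutableList<T> tail();
    abstract boolean isEmpty();

    ImmutableList<T> prepend(final T h) {
        final ImmutableList<T> t = this;
        return new ImmutableList<T>() {
            T head() { return h; }
            ImmutableList<T> tail() { return t; }
            boolean isEmpty() { return false; }
        };
    }

    @SuppressWarnings("unchecked")
    static <T> ImmutableList<T> nil() { return (ImmutableList<T>) NIL; }

    // Natural traversal order: accumulate left-to-right.
    <R> R foldLeft(R initial, Function2<R, T, R> f) {
        R acc = initial;
        for (ImmutableList<T> cur = this; !cur.isEmpty(); cur = cur.tail()) {
            acc = f.apply(acc, cur.head());
        }
        return acc;
    }

    // foldRight traverses the reversed list, feeding elements in from the right.
    <R> R foldRight(Function2<T, R, R> f, R initial) {
        R acc = initial;
        for (ImmutableList<T> cur = reverse(); !cur.isEmpty(); cur = cur.tail()) {
            acc = f.apply(cur.head(), acc);
        }
        return acc;
    }

    ImmutableList<T> reverse() {
        ImmutableList<T> acc = nil();
        for (ImmutableList<T> cur = this; !cur.isEmpty(); cur = cur.tail()) {
            acc = acc.prepend(cur.head());
        }
        return acc;
    }

    private static final ImmutableList<Object> NIL = new ImmutableList<Object>() {
        Object head() { throw new NoSuchElementException(); }
        ImmutableList<Object> tail() { throw new NoSuchElementException(); }
        boolean isEmpty() { return true; }
    };
}
```

With string concatenation, folding "append element to accumulator" over ["a", "b", "c"] gives "abc" from the left and "cba" from the right, which is exactly how a non-commutative operation exposes the difference.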

The filter method returns a sub-collection, based on a predicate that gets evaluated on every element of the original collection. If the predicate returns true for a given element, that element is included in the output collection. For Option, filter will return a Some if and only if the original Option was a Some and the predicate returns true for the wrapped value. For an ImmutableList, the code for filter is similar to map, but we only copy into the accumulator the values that satisfy the predicate.
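A sketch of filter on the list, using a Boolean-returning Function1 as the predicate:

```java
import java.util.NoSuchElementException;

interface Function1<T, R> { R apply(T t); }

abstract class ImmutableList<T> {
    abstract T head();
    abstract ImmutableList<T> tail();
    abstract boolean isEmpty();

    ImmutableList<T> prepend(final T h) {
        final ImmutableList<T> t = this;
        return new ImmutableList<T>() {
            T head() { return h; }
            ImmutableList<T> tail() { return t; }
            boolean isEmpty() { return false; }
        };
    }

    @SuppressWarnings("unchecked")
    static <T> ImmutableList<T> nil() { return (ImmutableList<T>) NIL; }

    // Like map, but only copy elements that satisfy the predicate.
    ImmutableList<T> filter(Function1<T, Boolean> predicate) {
        ImmutableList<T> acc = nil();
        for (ImmutableList<T> cur = this; !cur.isEmpty(); cur = cur.tail()) {
            if (predicate.apply(cur.head())) {
                acc = acc.prepend(cur.head());
            }
        }
        return acc.reverse();
    }

    ImmutableList<T> reverse() {
        ImmutableList<T> acc = nil();
        for (ImmutableList<T> cur = this; !cur.isEmpty(); cur = cur.tail()) {
            acc = acc.prepend(cur.head());
        }
        return acc;
    }

    private static final ImmutableList<Object> NIL = new ImmutableList<Object>() {
        Object head() { throw new NoSuchElementException(); }
        ImmutableList<Object> tail() { throw new NoSuchElementException(); }
        boolean isEmpty() { return true; }
    };
}
```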

Our last higher-order method is a little more complicated. flatMap takes a function that returns collections, maps it across all elements in our collection, and then "flattens" the result down to a collection of elements. Say you have a function that takes an Integer and returns an ImmutableList. If you map that function over an ImmutableList<Integer>, the result is an ImmutableList<ImmutableList<Integer>>, or a list of lists of integers. However, if you flatMap it, you get back an ImmutableList<Integer> consisting of the concatenation of all of the lists you would get using map. Interestingly, both map and filter can be implemented in terms of flatMap. For map, we simply wrap the return value from the passed function in a single-element list (or Some), while for filter we return a single-element list (or Some) for elements where the predicate evaluates to true and an empty list (or None) for elements where the predicate is false. Here are implementations and tests for flatMap:
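A sketch of flatMap on the list: walk the outer list, walk each inner list that the function produces, and prepend everything onto one accumulator, reversing once at the end:

```java
import java.util.NoSuchElementException;

interface Function1<T, R> { R apply(T t); }

abstract class ImmutableList<T> {
    abstract T head();
    abstract ImmutableList<T> tail();
    abstract boolean isEmpty();

    ImmutableList<T> prepend(final T h) {
        final ImmutableList<T> t = this;
        return new ImmutableList<T>() {
            T head() { return h; }
            ImmutableList<T> tail() { return t; }
            boolean isEmpty() { return false; }
        };
    }

    @SuppressWarnings("unchecked")
    static <T> ImmutableList<T> nil() { return (ImmutableList<T>) NIL; }

    // Map and flatten in one pass: every element of every inner list
    // goes onto the same accumulator.
    <R> ImmutableList<R> flatMap(Function1<T, ImmutableList<R>> f) {
        ImmutableList<R> acc = nil();
        for (ImmutableList<T> cur = this; !cur.isEmpty(); cur = cur.tail()) {
            for (ImmutableList<R> inner = f.apply(cur.head());
                 !inner.isEmpty(); inner = inner.tail()) {
                acc = acc.prepend(inner.head());
            }
        }
        return acc.reverse();
    }

    ImmutableList<T> reverse() {
        ImmutableList<T> acc = nil();
        for (ImmutableList<T> cur = this; !cur.isEmpty(); cur = cur.tail()) {
            acc = acc.prepend(cur.head());
        }
        return acc;
    }

    private static final ImmutableList<Object> NIL = new ImmutableList<Object>() {
        Object head() { throw new NoSuchElementException(); }
        ImmutableList<Object> tail() { throw new NoSuchElementException(); }
        boolean isEmpty() { return true; }
    };
}
```

An Option-returning function works the same way when flatMapped over a list: a None contributes zero elements and a Some contributes one, which is the "disappearing None" behaviour described below.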

A nice way of combining these two collection types is to flatMap a function that returns Option across an ImmutableList. The elements that evaluate to None simply disappear, while the Some values are fully realized (as in, you don't need to invoke get() on them).

Summary

In this post, we got to see a couple of immutable collection types.

While ImmutableList as I've implemented it takes a performance hit by using its own interface (since it tends to have to call reverse to provide consistent ordering), there are a couple of improvements that could be made. One would be to provide additional implementations of the methods that would contain all of the logic before the call to reverse, for the cases where order doesn't matter, and then implement the ordered versions by calling the unordered version followed by a call to reverse. The other would be to make NonEmptyList's tail member non-final, and make use of our private access to implement efficient appends by the higher-order methods. The code for that would be a fair bit more verbose (since we would have to hold a reference to the element preceding the terminal EmptyList). That said, it's possible that the additional operations per iteration might outweigh the cost of traversing (and creating) the list twice.

I think Option is one of my favourite types from Scala. The idea that Option gives you a warning through the type system that a value may not be initialized is fantastic. Sure, you're wrapping your object in another object (at a cost of 8 bytes on a 32-bit JVM, I believe), but (with some discipline) you have some assurance that things that don't return Option will not return null. With the option() factory method, you can even wrap the return values from "old-fashioned" Java APIs. Unfortunately, I haven't yet seen a language where NullPointerExceptions are completely avoidable (or aren't masked by treating uninitialized values as some default). Since Scala works on the JVM, null still exists. Haskell has the "bottom" type (which actually corresponds more closely to Scala's Nothing), which effectively takes the place of null (as I understand it -- I'm a dabbler at best in Haskell), though the use of bottom is heavily discouraged, from what I've read.

In between these collection types, I managed to sneak in an explanation of tail-call optimization, and how it can be used to avoid blowing your stack on recursive calls on a list. Tail-calls are not restricted to recursive functions (or even mutually-recursive sets of functions), but the stack-saving nature of tail-call optimizations tend to be most apparent when recursion is involved. The other bonus of TCO (with apologies for my vague memories of some form of assembly) is that the tail-call gets turned into a JMP ("jump" or "goto"), rather than a JSR ("jump, set return"), so the final evaluation can immediately return to the last spot where a tail-call didn't occur. That said, I should remind you that tail-call optimization only works when you're delegating your return to another method/function/procedure (that is, the compiler doesn't need to push your current location onto the stack in order to return). When you're traversing a binary tree, and need to visit the left and right children via recursion, at least the first recursive call is not a tail-call. Of course, in Java, with default JVM settings, in my experience, the stack seems to overflow around 50k calls. If you have a balanced binary tree with 50k levels, then you probably have enough elements to make traversal outlast the earth being consumed by the sun, in which case a StackOverflowError is the least of your worries, since you and everyone you love will be long dead. (Actually, since the recursion is likely to be depth-first, you're lucky -- the stack will overflow quite quickly, and you can call your mom to remind her that you love her and you're glad that she didn't die while your program ran. Moms love that kind of thing.)

For some really interesting information about tail-call optimization on the JVM at the bytecode level, I suggest you read John Rose's blog post on the subject from 2007.

I think this may be the longest post in the series so far. In particular, I've gone whole-hog on using unit tests to both establish that my code actually works (give or take copy-paste errors between my IDE and GitHub gists) and show examples of how to use these methods.

I must confess that this post was also more comfortable to write. Java doesn't make writing functions as objects terribly pleasant (since it wasn't really designed for it). I think the code in the last couple of posts involved more type arguments than actual logic. Working with data types is much easier, by comparison.

Where do we go from here?

I had been thinking about trying to implement a mechanism to combine lazy-loading of Function0 values with parallel execution, by replacing uninitialized values with Futures linked to a running thread on a ForkJoinPool. Unfortunately, the more I think about it, the more I realize that "parallelize everything" is a really bad idea (which is why nobody actually does it). I might be able to do it like memoization, where you can selectively memoize certain functions that are frequently used on a small set of inputs, so you would be able to parallelize only your "hot spot" functions. Unfortunately, my memoization implementation is kind of an ugly hack, requiring a parallel MemoizedFunctionN hierarchy, and I'm not particularly enthusiastic about creating a separate ParallelizedFunctionN hierarchy. Sure, I can write code generators for these different function specializations, but it's still not particularly elegant.

The last remaining "standard" functional programming concept I can think of that I haven't covered (besides pattern matching, which I don't think I can implement nicely in Java) is tuples. If I dedicate a post exclusively to tuples, I think it should be fairly short. After these last couple of lengthy posts, that may be a good way to go.

Saturday, March 10, 2012

Intro to Functional Programming (with examples in Java) - Part 3

Let's recap the last two entries in this series, and what we learned:
  • In the first post, we learned that we can implement functions as objects in Java and pass them as arguments to other functions or more traditional methods. Functions that can be passed around are called first-class functions. First-class functions can be implemented in C/C++ as function pointers, while languages that are more conventionally "functional", like Lisp, Haskell, Scala, or Javascript, simply treat functions as another "thing" that can be passed around. Functions that take other functions as parameters are called higher-order functions. This is similar to passing an object that implements a callback interface. Listeners in Swing are like function objects, though they may have multiple entry-points. Our function objects have a single "apply" method that basically means "call me".
  • In the second post, we learned about currying and partial application, which effectively say that any function with n parameters can be thought of as a function with one parameter that returns a function with n-1 parameters. Once we hit zero parameters, we actually invoke the logic we've specified. (In this post, we're going to look at not invoking the logic right away.) We also looked at two of the more important higher-order functions, map and fold. The map function transforms a collection to another collection by applying a function of one parameter to each element in the original collection and accumulating the results in the new collection. The fold function aggregates values from a collection, producing a new single value, by applying a binary function that takes the current output and a value from the original collection. In some contexts, fold is also called reduce. Thus, these two higher-order functions form the basis of so-called map-reduce algorithms, where you would typically apply the map step in parallel on multiple nodes, and aggregate the results via a fold on a single node. (Sometimes the fold can be parallelized, though it's not generally as trivial as parallelizing a map operation. As an example of a parallelizable fold, consider computing the sum of a list of integers. The sum of the whole list can be computed by partitioning it into sublists, folding the sum over each sublist in parallel, and then folding the sum function over the resulting list of partial sums. Anyway, this is just a teaser for a future post on parallelizing function evaluation.)

Lazy Evaluation

In the second post, I introduced partial application, giving the example of a Function2 that returns a Function1, and explained that this can be generalized to any FunctionN returning a Function(N-1) after binding the first parameter.

Similarly, we can apply the same logic to a Function1, and have it return a Function0, instead of a result.

What is a Function0? It's a function that takes no parameters, and returns a value. Since one of the tenets of functional programming is consistency (that is, a function, given the same arguments, should always return the same value), a Function0 (taking no arguments) produces a constant value. That is, a Function0 effectively is a constant value. For example, you could define a Function0 called Five, whose apply() method returns the Integer 5. However, if a Function0 is produced from a Function1 (or a FunctionN), it doesn't evaluate that value until you ask for it. If it retains the value from the first evaluation, and returns it rather than reevaluating on subsequent calls to apply() (since every evaluation should produce the same value), then you've effectively implemented lazy evaluation.

Recall the definition of Function1, from the first post in this series:

We have a function that takes a T and returns an R, for some types T and R, via the apply method. We can change this, so instead of returning an R directly, we return a Function0<R>. That Function0<R> will return the R once its get() method is invoked.

Here is a simple definition for a Function0 with lazy evaluation:

And here is a modified version of Function1, where we've renamed the previous abstract apply method to evaluate, to highlight the fact that when you call it, the code actually gets evaluated. When you simply apply the function, you get back something that can eventually be evaluated (ideally via Function0's get method).
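Together, the pair might look something like this (a sketch; field names and visibility are my own):

```java
// Function0 caches its result after the first evaluation, so repeated
// get() calls never re-run evaluate() -- that's the lazy evaluation.
abstract class Function0<R> {
    private R value;
    private boolean evaluated = false;

    public abstract R evaluate();

    public R get() {
        if (!evaluated) {
            value = evaluate();
            evaluated = true;
        }
        return value;
    }
}

// Function1.apply no longer computes anything: it returns a Function0
// that will run evaluate(t) only when someone finally asks for the value.
abstract class Function1<T, R> {
    public abstract R evaluate(T t);

    public Function0<R> apply(final T t) {
        final Function1<T, R> outer = this;
        return new Function0<R>() {
            public R evaluate() { return outer.evaluate(t); }
        };
    }
}
```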

We can also extend this to any FunctionN, by redefining their total application methods to use partial application (taking advantage of currying to treat them as Function1s):

I've chosen to make evaluate public throughout to allow the option of strict evaluation, meaning "I want you to actually compute this now, and not store up the things that you are going to do when I eventually call get()". While lazy evaluation with functional programming kind of allows you to build up a planned execution path at runtime and then trigger it with one call to get(), sometimes you do need to fully compute something before returning from the current function.

Memoization

If you were paying close attention to the currying section of the previous post, you may have noticed that we are creating new function objects every time we do a partial function application. This same problem extends to our creation of Function0 objects above. This is expensive, since each of these function objects consumes some memory (which may be short-lived, but there is still some overhead), and it kind of defeats the purpose of lazy evaluation. In particular, if we're creating a new Function0 every time we finish passing all of our parameters, then we would need to retain that specific Function0 to ensure that we don't reevaluate the logic the next time we need the result.

We can extend Function1 to remember the value passed as the parameter to apply, and hold onto the resulting Function0, to return it the next time we're called with that same parameter. This process of returning the same value based on remembered arguments is known as memoization.

I am making the assumption that the parameter passed is of a type T that implements hashCode and equals in a reasonable way.
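A simplified sketch of the class (the version described above holds Function0s in the map; here I cache the computed values directly to keep the idea visible):

```java
import java.util.HashMap;
import java.util.Map;

// Assumes the parameter type T has reasonable hashCode/equals, as noted.
abstract class MemoizedFunction1<T, R> {
    private final Map<T, R> cache = new HashMap<T, R>();

    public abstract R evaluate(T t);

    // evaluate runs at most once per distinct argument; after that,
    // apply just returns the cached result.
    public R apply(T t) {
        if (!cache.containsKey(t)) {
            cache.put(t, evaluate(t));
        }
        return cache.get(t);
    }
}
```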

Using an instance of this MemoizedFunction1 class, we know that its logic will be evaluated exactly once for each parameter value. Of course, this comes at a cost: for each parameter value, we retain the evaluation result object in memory (wrapped in a Function0, which itself adds a few more bytes). If you have a function that will be applied over a significant number of inputs, or is rarely applied to the same value twice (e.g. string processing functions operating on user input sent to a server have a decent chance of seeing billions of different strings), then the overhead of memoization likely would not pay off. This is why I chose to create a separate class, rather than simply adding memoization to Function1 directly. You can use a regular Function1 when you don't need memoization (which is likely to be most of the time), and a MemoizedFunction1 when you do. Since they use the same API (you just override evaluate), there's no additional thinking cost.

Let's confirm that the memoization works with a unit test, using everyone's favourite memoization and dynamic programming example, the Fibonacci function:
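The original test times the memoized version against the unmemoized one; here is a sketch of just the memoized Fibonacci, written directly as a MemoizedFunction1 subclass so that the recursive calls go through apply() (where the cache lives), and every fib(n) is computed at most once:

```java
import java.util.HashMap;
import java.util.Map;

abstract class MemoizedFunction1<T, R> {
    private final Map<T, R> cache = new HashMap<T, R>();

    public abstract R evaluate(T t);

    public R apply(T t) {
        if (!cache.containsKey(t)) {
            cache.put(t, evaluate(t));
        }
        return cache.get(t);
    }
}

class Fib {
    // Recursive calls invoke apply(), and virtual dispatch lands in
    // MemoizedFunction1.apply -- so subproblem results are reused.
    static final MemoizedFunction1<Integer, Long> FIB =
            new MemoizedFunction1<Integer, Long>() {
        public Long evaluate(Integer n) {
            if (n <= 1) return (long) n;
            return apply(n - 1) + apply(n - 2);
        }
    };
}
```

Without memoization the same recursion takes exponential time; with it, fib(50) is effectively instant.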

I selected the expected timings to give consistent results on somewhat faster and slower machines than mine. On my machine, the unmemoized version takes around 3.5 seconds, while the memoized version takes less than 1 millisecond.

Runtime memoization and recursive memoization

The previous example shows that, when writing your code, the memoized and unmemoized versions of a function are identical, except that one is instantiated as a MemoizedFunction1 and the other just a regular Function1. What if you want to dynamically memoize a function at runtime? We can implement a utility method to "upgrade" a regular Function1 as follows:
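A sketch of the utility (using a direct-apply Function1 for clarity). Note the caveat that the text goes on to explain: calls through the wrapper are cached, but recursive calls inside f's own evaluate still hit f's unmemoized apply:

```java
import java.util.HashMap;
import java.util.Map;

abstract class Function1<T, R> {
    public abstract R evaluate(T t);
    public R apply(T t) { return evaluate(t); }
}

abstract class MemoizedFunction1<T, R> extends Function1<T, R> {
    private final Map<T, R> cache = new HashMap<T, R>();

    @Override
    public R apply(T t) {
        if (!cache.containsKey(t)) cache.put(t, evaluate(t));
        return cache.get(t);
    }
}

class MemoizationUtils {
    // Wraps an existing Function1 in a memoizing shell.  Top-level calls
    // through the wrapper are cached; recursive calls inside f.evaluate
    // bypass the cache, which is exactly the problem discussed next.
    static <T, R> Function1<T, R> memoize(final Function1<T, R> f) {
        return new MemoizedFunction1<T, R>() {
            public R evaluate(T t) { return f.evaluate(t); }
        };
    }
}
```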

Let's create a test for it:

Unfortunately, that test will not pass (yet). The problem is with our method call targets, and the fact that we're creating a MemoizedFunction1 and wrapping it around a regular Function1. How do the method calls work out?

What we really want is something more like this (since MemoizedFunction1's apply method is where the memoization happens):

In order to route the call back to the outer MemoizedFunction1, it would be ideal if we could override this in the inner Function1. Unfortunately, we can't do that. Really, though, object-oriented programming can be mapped to procedural programming (and, indeed, I believe this is more or less how the first implementation of C++ worked) roughly as follows:

  • Every method is converted to a procedure with an additional this parameter, which is basically like a pointer to a C struct.
  • Every method call is transformed into a procedure call where the address of the object (or rather, its struct representation) on the left-hand side of the ., or the pointer on the left-hand side of a ->, is passed as the this parameter.
  • Unqualified calls to methods or references to fields within the current object can be thought of as using an explicit this->.

So, while we can't redefine this, we can make Function1's evaluate method take an extra parameter which will act in place of this for recursive calls. We'll call it self.

Note that for the base Function1, self really is this, as far as apply is concerned.

Now, we can redefine our Fibonacci function to use self.apply instead of just apply, and change MemoizationUtils to pass the MemoizedFunction's this as the self parameter.
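Putting the pieces together, a sketch of the self-parameter design looks like this (simplified to direct application; names follow the text):

```java
import java.util.HashMap;
import java.util.Map;

// evaluate now receives an explicit "self" to use for recursive calls.
abstract class Function1<T, R> {
    public abstract R evaluate(Function1<T, R> self, T t);

    // For a plain Function1, self really is this.
    public R apply(T t) { return evaluate(this, t); }
}

abstract class MemoizedFunction1<T, R> extends Function1<T, R> {
    private final Map<T, R> cache = new HashMap<T, R>();

    @Override
    public R apply(T t) {
        if (!cache.containsKey(t)) {
            // Pass this memoized wrapper as self, so recursive
            // self.apply(...) calls come back through the cache.
            cache.put(t, evaluate(this, t));
        }
        return cache.get(t);
    }
}

class MemoizationUtils {
    static <T, R> Function1<T, R> memoize(final Function1<T, R> f) {
        return new MemoizedFunction1<T, R>() {
            public R evaluate(Function1<T, R> self, T t) {
                return f.evaluate(self, t);
            }
        };
    }
}

class Fib {
    // Recursive calls go through self.apply, not plain apply, so the
    // same definition works memoized or not.
    static final Function1<Integer, Long> FIB = new Function1<Integer, Long>() {
        public Long evaluate(Function1<Integer, Long> self, Integer n) {
            if (n <= 1) return (long) n;
            return self.apply(n - 1) + self.apply(n - 2);
        }
    };
}
```

Unmemoized, FIB works but is exponential; wrapped with memoize, even fib(60) returns immediately because every recursive call is cached.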

With these changes, the testRuntimeMemoization test shown earlier now passes.

Memoizing functions of N parameters

Recall that because of currying, every function of N parameters can be thought of as a Function1 that returns a function of N - 1 parameters. This means that by implementing MemoizedFunction1, we get MemoizedFunctionN for free (almost). We just need to override the way that we curry, to return a MemoizedFunction1 that returns a MemoizedFunctionN-1:

And MemoizationUtils can have convenience methods to memoize them at runtime:

Summary

This is a post that gets into some of the "meat" of functional programming. Lazy evaluation can be quite powerful, as we defer evaluation of results until they are needed. Combined with memoization, we can ensure that we never reevaluate a function with the same parameters, using the previously computed value, instead. Memoization is not something you want to use everywhere, though. It's kind of a form of optimization, and, as Knuth said, "Premature optimization is the root of all evil". In a general program, if you memoize everything, you'll quickly exhaust your available memory. That said, because the functional paradigm allows considerable separation of concerns, we don't need to build memoization into any functions. The functions themselves can focus on their logic, and memoization can be tacked on as a cross-cutting concern (even at runtime, as we showed).

Where is memoization useful? In dynamic programming problems, you can define the problem in terms of a subproblem (or a problem on lesser inputs), with an eventual base-case. If you can write a function (or a collection of functions) that encapsulate the program logic in a naive way, you can probably just stick on a call to memoize and guarantee reuse of subproblem solutions. To a constant factor, it won't be as efficient as maintaining results in arrays with a hand-coded solution, but it's so much easier to write. If you need to solve dynamic programming problems as part of your job, and not just to compete in online contests against C programmers, writing your memoization logic once and transparently attaching it where needed may be a good approach.

The more interesting thing, in my opinion, about memoization is that it demonstrates how easy it is to attach a cross-cutting or orthogonal concern to your logic (without needing to resort to aspect-oriented programming). The same way that I wrote MemoizationUtils to attach memoization to a function could be used for other purposes. For example, we could attach a tracer that collects a capacity-bounded set of parameters passed to a function and the number of times the function is called. If a function is usually called with fewer than, say, 1000 different parameters (of reasonable size to be used as keys) but is called millions or billions of times during a typical execution of the program, that function may be a good candidate for memoization. Similarly, I believe we could simply wrap each of our functions in a tracer that keeps track of the function's summed execution time and call count in a ThreadLocal variable to implement a basic in-process profiler.

This ties in with the higher-order functions in the previous post. Whereas a typical object-oriented programmer will again and again write code to turn a list into another list or compute some sort of aggregate value from a list (whether it be a max, min, sum, or whatever), functional programming allows you to separate what you are doing (transforming/aggregating a list) from how you are doing it (the problem-specific logic).

In the next post, I think I will introduce concurrency via parallel/asynchronous evaluation. Assuming that's where I go (since I tend to get sidetracked between posts), Function0 will be very prominent, as it's effectively a simplified version of a java.util.concurrent.Future. If I don't feel like diving into concurrency, I will probably end up covering immutability and immutable data structures. Immutable data structures are more traditional (and fundamental) to functional programming, but concurrency is pretty cool. It's always an adventure guessing where I'm going to go next.

Thursday, January 12, 2012

Intro to Functional Programming (with examples in Java) - Part 2

In the previous post, I introduced first-class/higher-order functions, and gave some quick examples. We saw that it's a little tricky in Java to have higher-order functions that can generalize on type parameters, but that we can work around that with some factory method trickery. You still need to make sure that your factory methods are mapping the static type parameters to the raw/erased versions correctly, but at least you can only mess it up in one place.

In this post, we'll look at currying and partial application of functions, as well as two of the more popular higher-order functions in more detail.

Currying and Partial Application

In math, suppose you have a function f(x,y) of two variables. If you want, you can treat the two variables independently, by "fixing" one of them and defining a new function of one variable -- e.g. define g(x) = f(x,0) and h(y) = f(0, y).

Suppose you don't know ahead of time what fixed values you want to pick. Say you have a function f(x1, x2, x3, ..., xn) and you define g(x) = f(x, x2, x3, ..., xn). In this case, you "bind" the x variable from g to one of the variables from f, but the remaining variables would be unbound. Then g is not a function that returns "values" for each input x, but rather a function that returns functions of n - 1 variables. That is, to get a value, you would need to "call" the function as g(x)(x2, ..., xn) (i.e. you would first bind x, and then you would bind the remaining variables). For a fixed x, we could then define h(x2, ..., xn) = g(x)(x2, ..., xn) = f(x, x2, ..., xn). With the definition I'll use for currying, we have the following properties:

  • g is a "curried" version of f. That is, it is a function of one variable that returns functions of n-1 variables.
  • h = g(x) is a "partial application" of f. That is, the value x has been "applied" (or bound) to the variable x1, while the remaining n-1 variables remain unbound.

Getting back to our functional programming examples, invoking curry on a FunctionN returns a Function1, which in turn returns a FunctionN-1. Let's look at the somewhat messy code involved in doing this to a Function2:
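
Here is a sketch of that messy code. The Function1/Function2 shapes follow part 1 and are repeated so the snippet stands alone; the exact body of curry is my reconstruction of what the post describes.

```java
abstract class Function1<R, T1> {
    public abstract R apply(T1 t1);
}

abstract class Function2<R, T1, T2> {
    public abstract R apply(T1 t1, T2 t2);

    // curry() turns f(x, y) into f(x)(y): a one-argument function that
    // returns a one-argument function of the remaining parameter.
    public Function1<Function1<R, T2>, T1> curry() {
        return new Function1<Function1<R, T2>, T1>() {
            @Override
            public Function1<R, T2> apply(final T1 t1) {
                return new Function1<R, T2>() {
                    @Override
                    public R apply(T2 t2) {
                        // Both arguments are now bound; delegate to the original.
                        return Function2.this.apply(t1, t2);
                    }
                };
            }
        };
    }
}
```

For an addition Function2 named add, the call chain add.curry().apply(1).apply(2) binds 1 and then 2, yielding 3.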

This turns a function f(x,y) into the equivalent function f(x)(y), where x and y can be applied individually (but only in that order). Here is an example:

Of course, it would be nice to not have to explicitly write the curry() call when we want to bind the parameters one-by-one. (Especially since, for a Function4, for example, the call would turn into f.curry().apply(1).curry().apply(2).curry().apply(3).apply(4). In theory, I could modify my definition of curry to return a chain of Function1s, so you would only need to call it once, followed by the series of applys.) The easy way of doing this is to hide the curry call(s) within an overloaded definition of apply.

Recall that apply for a Function2<R,T1,T2> takes two arguments of type T1 and T2 respectively, and returns a value of type R. We can write an overloaded definition of apply that only takes one argument of type T1 and returns a Function1<R,T2>. Here is the code:
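
A sketch of that overload (class shapes repeated for self-containment; the body is my reconstruction of the behaviour described above):

```java
abstract class Function1<R, T1> {
    public abstract R apply(T1 t1);
}

abstract class Function2<R, T1, T2> {
    public abstract R apply(T1 t1, T2 t2);

    // Overloaded one-argument apply: binds t1 and returns a function
    // of the remaining parameter, hiding the currying machinery.
    public Function1<R, T2> apply(final T1 t1) {
        return new Function1<R, T2>() {
            @Override
            public R apply(T2 t2) {
                return Function2.this.apply(t1, t2);
            }
        };
    }
}
```

With this in place, the addition example becomes add.apply(1).apply(2), with no explicit curry() call.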

Using this overloaded definition of apply, we can now "partially" apply any function to yield a function that takes one less parameter. The addition example above simplifies to:

We can generalize the currying of Function2 to arbitrary FunctionN, and get partial application "for free", by making them extend a Curryable class defined as follows:

Then for FunctionN, we just need to define a curry function that returns a Function1 that returns a FunctionN-1:

A pair of useful higher-order functions

Recall from the previous post that a higher-order function takes at least one other function as a parameter. The general reason why you want to use higher-order functions is that they provide a logical separation between what you want to do and how it is going to be done. The higher-order function can define the boring/standard "how" and the passed function defines the particular "what" for your problem.

Consider the following example: We have a list of integers, and we want to double each one and return the output in a new list. The process is simple: allocate a new list, iterate over the existing list, storing two times the current value in the new list. Unfortunately, there are several different ways to implement iteration over the list, as shown in the following code:
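
A reconstruction of those iteration variants (the method names are my guesses, except doubleListWithIteratorFor, which the text names; note that it deliberately reproduces the bug called out in the text, adding 2 instead of multiplying by 2):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class DoublingExamples {
    // Indexed for loop.
    static List<Integer> doubleListWithIndexedFor(List<Integer> list) {
        List<Integer> result = new ArrayList<Integer>(list.size());
        for (int i = 0; i < list.size(); i++) {
            result.add(list.get(i) * 2);
        }
        return result;
    }

    // Explicit Iterator in a for loop.
    static List<Integer> doubleListWithIteratorFor(List<Integer> list) {
        List<Integer> result = new ArrayList<Integer>(list.size());
        for (Iterator<Integer> iter = list.iterator(); iter.hasNext();) {
            result.add(iter.next() + 2); // Oops -- should be * 2.
        }
        return result;
    }

    // Java 1.5 "foreach" loop.
    static List<Integer> doubleListWithForEach(List<Integer> list) {
        List<Integer> result = new ArrayList<Integer>(list.size());
        for (Integer value : list) {
            result.add(value * 2);
        }
        return result;
    }
}
```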

If you take the time to run this code (or you were looking carefully), you'll find that it has a bug. In doubleListWithIteratorFor, instead of saying iter.next() * 2, I wrote iter.next() + 2. Oops!

The problem, as I see it, is that the actual problem I'm trying to solve (doubling each element) has been overshadowed by the way it's being solved (iterating over each element, storing the result in a new list). The action of producing a new list (or collection) by "doing something" to each element of an existing list is very common in software development, with only the "something" varying from case to case.

In the previous post, I described a transformList higher-order function, and the tricks involved with creating one which is type-safe in its generic parameters. This function is more commonly called map in the functional programming world. For a recap and a better generalization of that code, here is the map part of my ListUtils class:
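
The actual ListUtils probably built map with the raw-singleton factory trick from part 1; here is a simpler static-method sketch with the same signature, so the snippet stands alone:

```java
import java.util.ArrayList;
import java.util.List;

abstract class Function1<R, T1> {
    public abstract R apply(T1 t1);
}

final class ListUtils {
    private ListUtils() {}

    // map :: (a -> b) -> [a] -> [b]
    // Apply f to each element of the input list, collecting the results
    // in a new list; the input list is never modified.
    public static <R, T1> List<R> map(Function1<R, T1> f, List<? extends T1> list) {
        List<R> result = new ArrayList<R>(list.size());
        for (T1 element : list) {
            result.add(f.apply(element));
        }
        return result;
    }
}
```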

The type parameters of map are explained as follows: map takes as input a function that turns T1s into Rs and a list of T1s, and returns a list of Rs. If you give me a device that can turn an apple into an orange, and a bag of apples, then I will give you a bag of oranges. In Haskell, this is written as map :: (a->b) -> [a] -> [b]. The consecutive arrows reflect the fact that all functions in Haskell are curried, so map partially-applied with a function from a to b yields a function that takes a list of a and returns a list of b. I'm introducing the Haskell function signature syntax now, so I can use it to help explain the definition of fold below.

Since the map function nicely handles the "how" of traversing a list and returning the transformation results in a new list, all I need to supply is "what" transformation I want to do, and "what" list I want to use as input. We can solve the list doubling problem from above with the following code:

The anonymous class boilerplate required by Java is a little annoying, but the only "executable" parts of that code are the map call, the return statement, and the apply invocation. In other words, the only executable pieces relate exactly to what I am trying to do (map a function that doubles its input over the list values).

Another common task in software is producing a single value from the contents of a list. This may be finding the min/max value in the list, returning the sum of the list, joining the list into a String, etc. Again, the functional programming world has a higher-order function for that, called fold.

Explaining fold in words is a little difficult. It takes three parameters (usually specified in the reverse order, but this allows me to go from "least complicated" to "most complicated"):

  • A list of input values.
  • A "seed" output value, which will be returned if the list input is empty.
  • A function that takes the current output value and an input value and produces a new output value.

Until now, I have been describing fold as one function, but it is actually a pair of functions, foldLeft and foldRight (or foldl and foldr in Haskell). There are two reasons: one is that the binary function supplied as input may expect the output and input values in either order, and another is that there are cases (especially in Haskell, due to its lazy evaluation) where evaluating the list elements from back to front may be more efficient than evaluating them from front to back. For now, we'll just consider foldLeft. In Haskell, it has the following signature: foldl :: (b -> a -> b) -> b -> [a] -> b. In words, this says:

  • Given:
    • a function that takes a b and an a (in that order), and returns a b
    • a value of type b, and
    • a list of as,
  • Return a value of type b.

The simplest (and fairly common) case for fold is when the input and output types are the same. For example, we have a list of integers and want to retrieve the maximum (integer) element. In this case, we supply a function that takes two integers and returns the bigger one (or, say, the first if they're equal), and a seed value that should correspond to the "maximum of an empty list" (maybe Integer.MIN_VALUE in Java). The fold function will iterate through the list, applying the "getBigger" function (combined with the seed or previous output) and return its last output.

Without further delay, here is the foldLeft function, in a raw form, a "factory" method and a partial-application factory:
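
A sketch of foldLeft in both forms; the shape of the partial-application factory is inferred from the findMax example below (supply the combining function first, then the seed, then the list), and the class definitions are repeated for self-containment:

```java
import java.util.List;

abstract class Function1<R, T1> {
    public abstract R apply(T1 t1);
}

abstract class Function2<R, T1, T2> {
    public abstract R apply(T1 t1, T2 t2);
}

final class ListUtils {
    private ListUtils() {}

    // Raw form -- foldl :: (b -> a -> b) -> b -> [a] -> b
    public static <B, A> B foldLeft(Function2<B, B, A> f, B seed, List<? extends A> list) {
        B accumulator = seed;
        for (A element : list) {
            accumulator = f.apply(accumulator, element);
        }
        return accumulator;
    }

    // Partial-application factory: bind the combining function now,
    // the seed next, and the list last.
    public static <B, A> Function1<Function1<B, List<? extends A>>, B> foldLeft(
            final Function2<B, B, A> f) {
        return new Function1<Function1<B, List<? extends A>>, B>() {
            @Override
            public Function1<B, List<? extends A>> apply(final B seed) {
                return new Function1<B, List<? extends A>>() {
                    @Override
                    public B apply(List<? extends A> list) {
                        return foldLeft(f, seed, list);
                    }
                };
            }
        };
    }
}
```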

Another simple example for fold is finding the sum of a list of integers. In this case our binary operation is addition, and our seed is zero. Here is the code used to find the maximum integer in a list, find the sum of a list, build an integer from a list of digits, and concatenate the list as a comma-separated string:

Note that all of these functions are reusable. Thanks to partial application, we can hold onto a findMax function and apply it to any list of integers to get the maximum value:

Function1<Integer, List<? extends Integer>> findMax = 
        ListUtils.foldLeft(new Function2<Integer, Integer, Integer>() {
    @Override
    public Integer apply(Integer i1, Integer i2) {
        return i1 < i2 ? i2 : i1;
    }
}).apply(Integer.MIN_VALUE);

Integer max = findMax.apply(Arrays.asList(1, 2, 3, 4));
Since this will only work for integers, it's not as generally useful as one might like. It should be possible to modify it to act on List<? extends Comparable<?>>, with the caveat that the result will also be a Comparable, which then needs to be cast to the appropriate type.

Conclusions so far

Writing this and the previous post has shown me that some fairly elaborate functional programming concepts can be implemented in Java in a way that's easy enough that even I can do it, albeit with a lot of anonymous class boilerplate. I believe that Project Lambda significantly reduces this boilerplate, bringing it closer to Scala in that regard. Someday, I'll try forking my Github source for these code snippets and rewriting it against a Project Lambda build.

The main limitations I see for functional programming in standard Java so far (beyond the boilerplate) are:

  • The assumption of mutability in the standard library collections is not ideal for functional programming (which, I suppose, is why the Functional Java team wrote their own immutable collections). I'll get into immutable data types and collections in a future post, where I talk about referential transparency.
  • The type system in Java makes it hard to write really general, reusable higher-order functions. Maybe there are tricks I could use, but I haven't been able to figure them out (besides the raw function hack I used for map and fold). Scala's type system seems to allow more general generic (higher-kinded) types, but I find the syntax involved pretty confusing. I find Haskell the best in this regard at this point. (The dynamically-typed languages also make it easy, at the cost of compile-time type guarantees.)

Come by next time, when I'll introduce tuples, memoization, and lazy evaluation.

Friday, December 9, 2011

Intro to Functional Programming (with examples in Java) - Part 1

What is functional programming?

Quoting Wikipedia, "In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data."

As someone with a math background, I like the simple, axiomatic ideas of the lambda calculus and the fact that you can derive fairly rich constructs from simpler ones (all using a relatively simple collection of operations).

Of course, like many developers these days, my programming experience has almost entirely been in the imperative, object-oriented world. Over the past year, I've done a lot of reading on functional programming in Scala, Clojure, Haskell, and Erlang, and have been solving online puzzles in Scala. Functional programming now makes sense to me from a practical standpoint. At the same time, whenever I try to explain some cool, short functional code to colleagues, their eyes glaze over, for what seem to be the following two reasons:

  • The syntax. Haskell or Lisp (including Clojure) look really different from languages with a C heritage. While Scala can be made to look a lot like Java, doing so tends to involve using its more object-oriented flavour, which defeats the purpose of a functional programming example. In my opinion, idiomatic Scala feels more like Haskell.
  • The payoff. They are usually not terribly excited, since they could generally do the same thing with slightly more verbose imperative code, so they don't really see the point.

In this post (and a likely followup or two), I'll try to address the first point, by building up some functional programming constructs in Java. For the second point (at a later date), I'll try to show some examples where functional programming lets you focus more on what your code is doing, and less on the details of what the computer is doing.

First-class Functions

One thing that seems to be needed for any functional programming is the idea that functions are objects that can be passed around. In Java, of course, objects are instances of classes, so let's create our first function class in Java:
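
A guess at that first class (the name is mine): a single-purpose function object wrapping one operation.

```java
// A function object that, given a String, returns its length.
class StringLengthFunction {
    public Integer apply(String s) {
        return s.length();
    }
}
```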

This is a single-purpose function class that, given a string, will return its length. We can pass this around, have it as a member, store it as a map value, etc. Unfortunately, it only does one thing. Let's make a base class that generalizes this a bit further to all functions that take a String as input and return an Integer:

This is more useful. We could imagine having a method that acts on a Collection of Strings and wants to produce an Integer for each one. We could then pass it any FunctionStringToInteger as a parameter, independent of how the Integers are produced from the Strings.

Let's generalize this further, using Java generics to handle any function of one argument. In this case, we want to generalize on the input type T1 and the return type R. (I'm using T1 instead of T, since we'll soon be looking at functions that take multiple parameters, which could each be of different type.)

While the generic parameters can be a little messy, we can now model any function of one parameter as a subclass of this Function1 class. Here's a simple example:
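
A sketch of the Function1 class and an example subclass (the apply method name matches the later posts; the rest is reconstruction):

```java
// Any function of one argument: takes a T1, returns an R.
abstract class Function1<R, T1> {
    public abstract R apply(T1 t1);
}

// Example subclass: the string-length function, rewritten against the
// generic base class.
class StringLength extends Function1<Integer, String> {
    @Override
    public Integer apply(String s) {
        return s.length();
    }
}
```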

Of course, having the function as an object and invoking it directly doesn't really say anything about first-class functions (and mostly just looks like a much more verbose way of calling a method). Let's write a method that takes a function as a parameter, so the first-classedness becomes more useful:
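
A sketch of such a method (the containing class name is mine; Function1 is repeated so the snippet stands alone):

```java
import java.util.ArrayList;
import java.util.List;

abstract class Function1<R, T1> {
    public abstract R apply(T1 t1);
}

class ListTransforms {
    // For each element of the input list, apply f and collect the results
    // into a new list.
    public static <R, T1> List<R> transformList(List<T1> list, Function1<R, T1> f) {
        List<R> result = new ArrayList<R>(list.size());
        for (T1 element : list) {
            result.add(f.apply(element));
        }
        return result;
    }
}
```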

Now we can call transformList with a list of elements and a function that transforms individual elements and get back a list of all of the transformed elements. This is actually a fairly common pattern, so transformList is part of the standard library in a lot of functional languages (though it's usually called "map" -- we'll switch to calling it map in part 2).

We can also define classes for functions that take more than one parameter:
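
The higher-arity classes follow the same pattern as Function1; here are the first two as a sketch:

```java
// A function of two arguments: takes a T1 and a T2, returns an R.
abstract class Function2<R, T1, T2> {
    public abstract R apply(T1 t1, T2 t2);
}

// A function of three arguments, and so on for higher arities.
abstract class Function3<R, T1, T2, T3> {
    public abstract R apply(T1 t1, T2 t2, T3 t3);
}
```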

That's cool, but isn't it annoying that you need to make FunctionN classes for each number of parameters that you want? Yes, but you define the classes once and then you use them forever, or ideally they're already defined in a library (e.g. Functional Java defines classes F, F2, ..., F8, and the Scala standard library defines classes Function1, ..., Function22). For this blog, we'll do things the hard way, and implement everything ourselves. (In particular, I will be adding more functionality to the base FunctionN classes in part 2.)

Higher-order Functions

Once you have first-class functions, it's not hard to imagine creating functions that take functions as parameters. Above, we defined the transformList method that took a function as one of its parameters, but methods aren't functions as we've defined them (that is, methods are not first-class in Java). Here's a silly example of a higher-order function with a very long name:

Why don't we turn transformList into a function? Then we could pass it to other functions/methods to build up more complicated things. Unfortunately, this is an area where Java makes things more of a pain for functional programming. I think it's related to Java not having "higher-kinded" types.

The basic problem is that, while you can write a generic Function2 that has a Function1 as a type parameter, you don't necessarily know the R, T1 type parameters for that Function1. What we would really like is to write a type-safe Function2 that can take any Function1 as a parameter, regardless of what that Function1's types are, and without necessarily having that Function1 when creating the Function2. Unfortunately, I don't think that's entirely possible in Java, but we can come close.

Let's try to define a generic transformList in a naive way:

Well, that's not going to work. If you paste that code snippet into an IDE (as I just did), all of the Rs and T1s should be highlighted in red. There is no context to define what these type parameters should be in the general case. So, what can we do to get around this? One approach, if you're into taking risks, is to just drop the unbound type parameters:

That will compile, and you get the correct result if you pass the right types of parameters. Unfortunately, if you call apply with a list of Integers and a function that takes something else (e.g. String) as a parameter, you won't notice your mistake until runtime (when a ClassCastException is thrown). You will get a compiler warning regarding unchecked calls within the definition of the apply method of rawTransformList, but that is it.

If we cannot directly define a generic, statically-typed higher-order function for transformList, and a raw version is not safe, maybe we can define a generic factory method that builds these statically-typed functions by providing a "hint" of the type parameters:

This is looking better. We don't need to know exactly which Function1 will be passed to transformList, but we do need to know the input and return types for that Function1. Unfortunately, this will create a new Function2 every time we call it. We could hold onto transformList instances for every input/return type pair that we use, but that's still wasteful (especially since the type parameters will be erased by the compiler, and the stateless instances will all be identical at runtime).

I believe the best approach is to combine the above two approaches. Use a private static raw instance singleton, which is returned by a generic public factory method:
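
A sketch of the combined approach (class and field names are mine; Function1/Function2 repeated for self-containment):

```java
import java.util.ArrayList;
import java.util.List;

abstract class Function1<R, T1> {
    public abstract R apply(T1 t1);
}

abstract class Function2<R, T1, T2> {
    public abstract R apply(T1 t1, T2 t2);
}

final class TransformLists {
    // One raw, stateless instance shared across all type instantiations.
    @SuppressWarnings({"rawtypes", "unchecked"})
    private static final Function2 RAW_TRANSFORM_LIST =
            new Function2<List, List, Function1>() {
        @Override
        public List apply(List list, Function1 f) {
            List result = new ArrayList(list.size());
            for (Object element : list) {
                result.add(f.apply(element));
            }
            return result;
        }
    };

    // Public factory: attaches static types to the singleton. This is
    // harmless at runtime, since the type parameters are erased anyway.
    @SuppressWarnings("unchecked")
    public static <R, T1> Function2<List<R>, List<T1>, Function1<R, T1>> transformList() {
        return RAW_TRANSFORM_LIST;
    }
}
```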

This gives you a Function2 that statically type-checks its inputs and outputs, despite the fact that it's the same one being used for all type parameters. Because of Java's compile-time type erasure, it makes no difference at runtime.

What's next?

I think this post is getting long enough. We've gone over the basics of first-class and higher-order functions. In the next posts, I'll add currying to the functions and talk about the idea of everything (conceptually) being a function.

The code for this series of blog posts is available in my public github repository.

Wednesday, March 30, 2011

Immutable Binary Trees (Part 2)

On Monday, I introduced a basic implementation of immutable binary trees in Java. At the end of that post, I wrote:

There are several problems with this implementation that I plan to address in a future post:

  • The remove method creates a new node at each level, even if the element to remove does not exist.

  • The use of null to indicate the lack of a left or right child opens us up to the potential of NullPointerExceptions and wastes space from those fields. (In particular, in a balanced binary tree, half the elements will be leaf nodes.)

  • Currently, there is no way to create an empty tree, since every node has a value. If you look at the ImmutableTreeRunner implementation above, I had to generate my first random value before I could create the ImmutableBinaryTree.



This is that future post. I'll present a much cleaner, more memory-efficient implementation that solves these problems. That said, this is still not a good binary tree implementation to use. I'm not implementing any balancing, so it's generally going to be quite slow. It's just a more elegant unusable solution.

The first problem was largely laziness on my part, I suppose. (It would have been possible to better handle remove in the old implementation.) The other two problems can be addressed by accepting that not all immutable binary trees should be created equal, and establishing a class hierarchy:



That may look overly complicated, but all of the logic will actually degenerate into handling empty trees and handling non-empty trees. The NonEmptyTree subclasses only add fields and provide one-line implementations for NonEmptyTree's abstract methods.

Note that I gave the EmptyTree rounded edges in the diagram to distinguish it -- this will be implemented as a singleton (conceptually, there is only one empty immutable tree). Also, this makes for some simpler code, as we can use == to see if a child is the empty tree.

First, let's look at the implementation of the ImmutableBinaryTree abstract base class:



We specialize the declared return type of add and remove, while still satisfying the ImmutableSet interface I put up on Monday. We declare an internal addAll helper method (though this could also be added to the ImmutableSet interface). Finally, we create two factory methods for immutable binary trees -- one that returns the empty tree and another that constructs a series of trees until all the given elements have been added.

Next we implement all of the abstract methods for the empty tree. These are all one-liners:
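
A compressed sketch of the EmptyTree singleton pattern (the real base class also declares add, remove, addAll, and toList, which I've trimmed here to keep the snippet self-contained):

```java
import java.util.Collections;
import java.util.List;

// Abbreviated base class -- just enough to show the singleton trick.
abstract class ImmutableBinaryTree<T extends Comparable<T>> {
    public abstract boolean contains(T element);
    public abstract List<T> toList();
}

final class EmptyTree<T extends Comparable<T>> extends ImmutableBinaryTree<T> {
    // The single untyped instance.
    @SuppressWarnings("rawtypes")
    private static final EmptyTree INSTANCE = new EmptyTree();

    private EmptyTree() {}

    // Safe despite the unchecked cast: no state in this class depends on T.
    // (The same trick as Collections.emptyList() in Apache Harmony.)
    @SuppressWarnings("unchecked")
    public static <T extends Comparable<T>> EmptyTree<T> instance() {
        return INSTANCE;
    }

    @Override public boolean contains(T element) { return false; }
    @Override public List<T> toList() { return Collections.emptyList(); }
}
```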



We also store the (untyped) singleton EmptyTree instance and provide a (typed) static accessor for it. In this case, working without the T type is perfectly legitimate -- there is no state influenced by T in instances of this class. That said, the Java compiler apparently can't tell that this is safe (which is fair, since usually it's not safe). So, I've used the @SuppressWarnings annotations to reassure the compiler that I know what I'm doing. (Note that a similar approach is used at least in the Apache Harmony implementation of Collections.emptyList(), which is solving almost exactly the same problem.)

Before thinking about the behaviour of non-empty trees, I think it makes sense to think about their state. Firstly, all non-empty trees represent a node with a value. So, NonEmptyTree will have a field of type T for that value. Additionally, non-empty trees have 0, 1, or 2 children. If they have 1 child, it is either a left or right child. This motivates the NonEmptyTree hierarchy presented above. By isolating the state (that is, the fields) into subclasses, we ensure that memory is not wasted on fields that are unused (that is, are permanently null). It also makes it easier to reason about the state being used in each subclass.

Fortunately, by ensuring that all of the subclasses provide a consistent exposure of state, that is, they supply a left and right child on demand (where they may be the singleton EmptyTree), we can safely implement all of the logic in NonEmptyTree:



Why did I declare emptyTree local variables to capture the empty tree instance before checking equality against passed parameters? It turns out that the Java type inferencer (or at least the one used in my local version of Eclipse on a Mac) wouldn't allow me to check e.g. tree == EmptyTree.instance(). It's possible that I messed up with the covariance of the type parameter. Regardless, adding the local variable was an easy way of making my intentions clear to the type inferencer.

The general flow is as follows:

  • add is a special case of addAll, unless the current node already holds the value being added.

  • remove either removes the current node (at which point, we merge the left and right children), or replaces the relevant child with the result of the remove operation applied to it.

  • addAll either merges the newly added tree's children with this node's children (if they have the same root value), or replaces the relevant child after applying the addAll operation to it.

  • replaceChildren runs through the five relevant cases for replacing the children of a given node:

    • If both new children are equal to the current children, then nothing has changed. Return this.

    • If both new children are the EmptyTree singleton (and weren't before, or they would have been caught by the previous check), then we return a new Leaf.

    • If one new child is the EmptyTree, we return a new LeftBranch or RightBranch.

    • Otherwise, both children are non-empty so we return a new DualBranch, which is guaranteed to be different from the current node (by the first check).



  • The toList and contains methods are fairly self-explanatory.


Finally, we have all of the specializations of NonEmptyTree, which are a series of constructors and one-line method implementations:

There are a couple of points that I found interesting working on this code:

  • There is not a single null check in this code. Instead, that role has been absorbed by EmptyTree, which I like to think of as a sort of "type-safe" null. Instead of checking for null and deciding how to react in every situation, I've defined a "null-like" element, and specified how it should implement my interface in a consistent way. I did still check for EmptyTree in replaceChildren, to ensure that the most efficient concrete NonEmptyTree is returned. I could have just returned a new DualBranch, and the correctness of the algorithms would still hold.

  • A Java object, as I understand it, occupies enough memory to hold a reference to its runtime class, and the space occupied by its fields. Since EmptyTree is a singleton, its memory consumption doesn't really matter, but should effectively be this single reference to its class. Leaf is a concrete class with a reference to its class and a reference to the contained value of the node. On the other extreme, DualBranch holds four references: to the class, the value, a left child, and a right child. Thanks to the immutable nature of this code, we are required to return new objects, which don't necessarily need to be of the same type as our current (runtime) class. We can make those objects as efficient as possible. If I had implemented a proper balancing strategy, half the nodes would be Leafs, and would occupy roughly half the memory of corresponding DualBranches (disregarding the space occupied by the values themselves, aside from their references).



While there's nothing new or clever in these posts, I've found them to be a good learning exercise, and they have helped me reinforce some functional programming ideas. In particular, while the whole "type-safe null" idea expressed by EmptyTree is pretty standard fare in Haskell, it's still pretty novel to me as a (mostly) Java developer. Also, the particular refactoring in this post was motivated by the elegant simplicity of Scala's immutable List, which I feel that I now understand more clearly.

Of course, I did need to add those two @SuppressWarnings annotations in EmptyTree, which I believe Scala avoids by having Nil (the singleton empty list) have a type parameter of Nothing (which inherits from everything -- if this sounds weird but intriguing, read up on some Scala, it actually all makes sense). I don't believe it's possible to do the same in Java, so I'm happy to at least be consistent with Apache Harmony.

Monday, March 28, 2011

Immutable Binary Trees

I have been reading a fair bit about functional programming recently, and the advantages you can get in a concurrent execution environment by using immutable data types (or persistent data structures).

While it's not too hard to wrap your head around an immutable list structure with an efficient prepend operation, I wanted to try implementing a slightly more complicated immutable structure (but not as complicated as a trie). So, I decided to try implementing an immutable binary tree in Java, and then work on improving it a bit.

Note that this is not a "good" implementation. In particular, I'm not going to bother rebalancing the tree, so it's very possible for many operations to be O(n) instead of the O(log n) that a balanced tree would provide.

The main idea that I had trouble fully grasping when learning about immutable data structures is the idea that you usually don't need to copy everything when adding/removing an element. So, let's look at an example of a binary tree in pictures and consider what elements need to be replaced and what can be reused when we add a new element to the tree.

Consider the following tree:



The rules are simple: at each node, values less than the current node's value are stored in the left subtree, while values greater are stored in the right subtree.

Now, suppose we want to add the number 11 to the tree. It should be added as a right child of the 10 node. Since the current 10 node has no children (and is itself an immutable tree), we must allocate a new 10 node with 11 as a right child. The new 10 node will be the left child of a new 13 node, which can reuse the old 16 node as its right child (along with the children of 16). Finally, a new 8 node must be allocated with its right child pointing to the new 13 node and its left child pointing to the existing 4 node.

Here is a picture with the newly-allocated nodes highlighted in red:



In total, four new nodes were allocated: one for each level in the tree. For a balanced tree, this generalizes to O(log n) node allocations -- the same asymptotic cost as insertion into a mutable binary tree. The remainder of the tree continues to point to the old values. Assuming we aren't holding onto a reference to the old tree root, the original 8, 13, and 10 nodes would now be eligible for garbage collection. (Though if another thread were accessing the old version of the tree, it would not be affected by this change, avoiding a potential race condition.)

Now, let's take a look at how to implement a simple immutable binary tree in Java.

To start with, I'll create a simple ImmutableSet interface (which doesn't follow the java.util.Set interface, since that is fundamentally based on mutability -- e.g. the "normal" add method has a void return type):



I've included a toList method for testing purposes, since it was easier than implementing the Iterable interface.

The fields and constructor of the ImmutableBinaryTree implementation are as follows:
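The listing is missing here as well, but the description suggests a sketch along these lines (field names are assumptions; the fields are left package-private to keep the sketch short):

```java
// Hypothetical sketch of the ImmutableBinaryTree fields and constructor.
// Every field is final: once a node is built, it can never change.
final class ImmutableBinaryTree {
    final int value;
    final ImmutableBinaryTree left;   // null means "no left child"
    final ImmutableBinaryTree right;  // null means "no right child"

    ImmutableBinaryTree(int value, ImmutableBinaryTree left,
                        ImmutableBinaryTree right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}
```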


All fields must be final in an immutable data structure.

Here is the implementation of the add method:
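The listing itself is lost, so here is a sketch of how add and addSubtree could be written, using a condensed int-valued node rather than the generic one (all names are assumptions):

```java
// Hypothetical sketch of add/addSubtree on an int-valued immutable tree.
final class Tree {
    final int value;
    final Tree left, right;  // null = no child

    Tree(int value, Tree left, Tree right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }

    Tree add(int element) {
        // wrap the element in a leaf node, then merge it into this tree
        return addSubtree(new Tree(element, null, null));
    }

    // Rebuilds only the nodes on the path to the insertion point; every
    // untouched subtree is shared between the old and new trees.
    Tree addSubtree(Tree subtree) {
        if (subtree.value < value) {
            return new Tree(value,
                            left == null ? subtree : left.addSubtree(subtree),
                            right);
        } else if (subtree.value > value) {
            return new Tree(value,
                            left,
                            right == null ? subtree : right.addSubtree(subtree));
        } else {
            return this;  // value already present: nothing to allocate
        }
    }
}
```

Adding 11 to the example tree from the pictures above allocates exactly the four asterisked nodes and shares the rest.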


We create a leaf node for the new element and add it as a subtree to the existing tree. The addSubtree method is reused in the implementation of remove below, and could be used to implement a public addAll method that allows merging of two ImmutableBinaryTrees.
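The remove listing is also missing; the following is a hedged sketch of how it could look on a condensed int-valued node, with addSubtree repeated so the sketch stands alone (all names are assumptions):

```java
// Hypothetical sketch of remove on an int-valued immutable tree.
final class Node {
    final int value;
    final Node left, right;

    Node(int value, Node left, Node right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }

    Node addSubtree(Node subtree) {
        if (subtree.value < value)
            return new Node(value, left == null ? subtree : left.addSubtree(subtree), right);
        if (subtree.value > value)
            return new Node(value, left, right == null ? subtree : right.addSubtree(subtree));
        return this;
    }

    Node remove(int element) {
        if (element < value)
            // base case 1: element not present, return this unchanged
            return left == null ? this : new Node(value, left.remove(element), right);
        if (element > value)
            return right == null ? this : new Node(value, left, right.remove(element));
        // base case 2: remove this node by merging its two subtrees
        if (left == null) return right;   // may be null if this was a leaf
        if (right == null) return left;
        return left.addSubtree(right);    // every right value exceeds every left value
    }
}
```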



The remove method has two base cases (where we don't recurse into child branches). If the element is not in the tree, we return the unmodified this. If the current element is to be removed, we merge the left and right subtrees, arbitrarily choosing to add the right subtree as a descendant of the left (if the left subtree is null, we simply return the right subtree).

The contains method is effectively the same as it would be in a mutable binary tree:
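With the listing gone, a sketch on a condensed int-valued node might look like this (names are assumptions):

```java
// Hypothetical sketch of contains; the logic is the same as in a mutable BST,
// since lookups never modify the tree.
final class BST {
    final int value;
    final BST left, right;

    BST(int value, BST left, BST right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }

    boolean contains(int element) {
        if (element < value) return left != null && left.contains(element);
        if (element > value) return right != null && right.contains(element);
        return true;  // element == value
    }
}
```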


The toList method does an in-order traversal of the tree to return an ordered list:
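The listing is lost; a sketch of the in-order traversal on a condensed int-valued node could be (names are assumptions):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of toList: an in-order traversal (left, node, right)
// of a binary search tree yields the elements in ascending order.
final class OrderedTree {
    final int value;
    final OrderedTree left, right;

    OrderedTree(int value, OrderedTree left, OrderedTree right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }

    List<Integer> toList() {
        List<Integer> result = new ArrayList<>();
        if (left != null) result.addAll(left.toList());
        result.add(value);
        if (right != null) result.addAll(right.toList());
        return result;
    }
}
```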


For debugging purposes, and to visualize the tree depth, I decided to override toString:
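The original output format is unknown, so this is only one plausible sketch: it prints the right subtree first and indents each value by its depth, so the output reads like the tree rotated 90 degrees (all names are assumptions):

```java
// Hypothetical sketch of a toString that visualizes tree depth:
// right subtree on top, each value indented by its depth.
final class PrintableTree {
    final int value;
    final PrintableTree left, right;

    PrintableTree(int value, PrintableTree left, PrintableTree right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }

    @Override
    public String toString() {
        return toString(0);
    }

    private String toString(int depth) {
        StringBuilder sb = new StringBuilder();
        if (right != null) sb.append(right.toString(depth + 1));
        for (int i = 0; i < depth; i++) sb.append("  ");
        sb.append(value).append('\n');
        if (left != null) sb.append(left.toString(depth + 1));
        return sb.toString();
    }
}
```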


Finally, here is a runner I used to check the correctness of my implementation's addition and removal against the mutable TreeSet from the Java standard library:
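The runner listing is missing too; below is a self-contained sketch of the idea: apply the same random additions and removals to both structures, then compare the sorted contents. The condensed ITree class stands in for the full ImmutableBinaryTree, and all names and signatures here are assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.TreeSet;

// Condensed stand-in for the immutable tree described in this post.
final class ITree {
    final int value;
    final ITree left, right;

    ITree(int value, ITree left, ITree right) {
        this.value = value; this.left = left; this.right = right;
    }

    ITree add(int e) { return addSubtree(new ITree(e, null, null)); }

    ITree addSubtree(ITree s) {
        if (s.value < value) return new ITree(value, left == null ? s : left.addSubtree(s), right);
        if (s.value > value) return new ITree(value, left, right == null ? s : right.addSubtree(s));
        return this;
    }

    ITree remove(int e) {
        if (e < value) return left == null ? this : new ITree(value, left.remove(e), right);
        if (e > value) return right == null ? this : new ITree(value, left, right.remove(e));
        if (left == null) return right;
        if (right == null) return left;
        return left.addSubtree(right);
    }

    List<Integer> toList() {
        List<Integer> out = new ArrayList<>();
        if (left != null) out.addAll(left.toList());
        out.add(value);
        if (right != null) out.addAll(right.toList());
        return out;
    }
}

final class ImmutableTreeRunner {
    // Returns true if the immutable tree and the TreeSet agree after 'ops' steps.
    static boolean run(long seed, int ops) {
        Random rand = new Random(seed);
        TreeSet<Integer> expected = new TreeSet<>();
        // the tree cannot be empty, so generate the first value up front
        int first = rand.nextInt(100);
        expected.add(first);
        ITree tree = new ITree(first, null, null);
        for (int i = 0; i < ops; i++) {
            int v = rand.nextInt(100);
            if (rand.nextBoolean() || expected.size() == 1) {
                tree = tree.add(v);
                expected.add(v);
            } else if (expected.contains(v)) {
                tree = tree.remove(v);
                expected.remove(v);
            }
        }
        // TreeSet iterates in ascending order, matching the in-order toList
        return tree.toList().equals(new ArrayList<>(expected));
    }
}
```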


There are several problems with this implementation that I plan to address in a future post:

  • The remove method creates a new node at each level, even if the element to remove does not exist.

  • The use of null to indicate the lack of a left or right child opens us up to potential NullPointerExceptions and wastes the space those fields occupy. (In particular, in a balanced binary tree, half the elements will be leaf nodes.)
  • Currently, there is no way to create an empty tree, since every node has a value. If you look at the ImmutableTreeRunner implementation above, I had to generate my first random value before I could create the ImmutableBinaryTree.

Tuesday, January 11, 2011

Functional Programming Videos

An email on the Scala mailing list pointed me to some excellent videos on functional programming:

http://cufp.org/videos

So far, I've watched "Naïveté vs Experience" and "Real world Haskell". Excellent viewing!