Friday, December 9, 2011

Intro to Functional Programming (with examples in Java) - Part 1

What is functional programming?

Quoting Wikipedia, "In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data."

As someone with a math background, I like the simple, axiomatic ideas of the lambda calculus and the fact that you can derive fairly rich constructs from simpler ones (all using a relatively simple collection of operations).

Of course, like many developers these days, my programming experience has almost entirely been in the imperative, object-oriented world. Over the past year, I've done a lot of reading on functional programming in Scala, Clojure, Haskell, and Erlang, and have been solving online puzzles in Scala. Functional programming now makes sense to me from a practical standpoint. At the same time, whenever I try to explain some cool, short functional code to colleagues, their eyes glaze over, for what seem to be the following two reasons:

  • The syntax. Haskell or Lisp (including Clojure) look really different from languages with a C heritage. While Scala can be made to look a lot like Java, doing so tends to involve using its more object-oriented flavour, which defeats the purpose of a functional programming example. In my opinion, idiomatic Scala feels more like Haskell.
  • They are also usually not terribly excited, since they could generally do the same thing with some slightly more verbose imperative code, so they don't really see the point.

In this post (and a likely followup or two), I'll try to address the first point, by building up some functional programming constructs in Java. For the second point (at a later date), I'll try to show some examples where functional programming lets you focus more on what your code is doing, and less on the details of what the computer is doing.

First-class Functions

One thing that seems to be needed for any functional programming is the idea that functions are objects that can be passed around. In Java, of course, objects are instances of classes, so let's create our first function class in Java:

This is a single-purpose function class that, given a string will return its length. We can pass this around, have it as a member, store it as a map value, etc. Unfortunately, it only does one thing. Let's make a base class that generalizes this a bit further to all functions that take a String as input and return an Integer:

This is more useful. We could imagine having a method that acts on a Collection of Strings and wants to produce an Integer for each one. We could then pass it any FunctionStringToInteger as a parameter, independent of how the Integers are produced from the Strings.

Let's generalize this further, using Java generics to handle any function of one argument. In this case, we want to generalize on the input type T1 and the return type R. (I'm using T1 instead of T, since we'll soon be looking at functions that take multiple parameters, which could each be of different type.)

While the generic parameters can be a little messy, we can now model any function of one parameter as a subclass of this Function1 class. Here's a simple example:

Of course, having the function as an object and invoking it directly doesn't really say anything about first-class functions (and mostly just looks like a much more verbose way of calling a method). Lets write a method that takes a function as a parameter, so the first-classedness becomes more useful:

Now we can call transformList with a list of elements and a function that transforms individual elements and get back a list of all of the transformed elements. This is actually a fairly common pattern, so transformList is part of the standard library in a lot of functional languages (though it's usually called "map" -- we'll switch to calling it map in part 2).

We can also define classes for functions that take more than one parameter:

That's cool, but isn't it annoying that you need to make FunctionN classes for each number of parameters that you want? Yes, but you define the classes once and then you use them forever, or ideally they're already defined in a library (e.g. Functional Java defines classes F, F2, ..., F8, and the Scala standard library defines classes Function1, ..., Function22). For this blog, we'll do things the hard way, and implement everything ourselves. (In particular, I will be adding more functionality to the base FunctionN classes in part 2.)

Higher-order Functions

Once you have first-class functions, it's not hard to imagine creating functions that take functions as parameters. Above, we defined the transformList method that took a function as one of its parameters, but methods aren't functions as we've defined them (that is, methods are not first-class in Java). Here's a silly example of a higher-order function with a very long name:

Why don't we turn transformList into a function? Then we could pass it to other functions/methods to build up more complicated things. Unfortunately, this is an area where Java makes things more of a pain for functional language programming. I think it's related to Java not having "higher-kinded" types.

The basic problem is that, while you can write a generic Function2 that has a Function1 as a type parameter, you don't necessarily know the R, T1 type parameters for that Function1. What we would really like to do would be to write a type-safe Function2 that can take any Function1 as a parameter, regardless of what that Function1's types are, and without necessarily having that Function1 when creating the Function2. Unfortunately, I don't think that's entirely possible in Java, but we're going to come up with a way of coming close.

Let's try to define a generic transformList in a naive way:

Well, that's not going to work. If you paste that code snippet into an IDE (as I just did), all of the Rs and T1s should be highlighted in red. There is no context to define what these type parameters should be in the general case. So, what can we do to get around this? One approach, if you're into taking risks, is to just drop the unbound type parameters:

That will compile, and you get the correct result if you pass the right types of parameters. Unfortunately, if you call apply with a list of Integers and a function that takes something else (e.g. String) as a parameter, you won't notice your mistake until runtime (when a ClassCastException is thrown). You will get a compiler warning regarding unchecked calls within the definition of the apply method of rawTransformList, but that is it.

If we cannot directly define a generic, statically-typed higher-order function for transformList, and a raw version is not safe, maybe we can define a generic factory method that builds these statically-typed functions by providing a "hint" of the type parameters:

This is looking better. We don't need to know exactly which Function1 will be passed to transformList, but we do need to know the input and return types for that Function1. Unfortunately, this will create a new Function2 every time we call it. We could hold onto transformList instances for every input/return type pair that we use, but that's still wasteful (especially since the type parameters will be erased by the compiler, and the stateless instances will all be identical at runtime).

I believe the best approach is to combine the above two approaches. Use a private static raw instance singleton, which is returned by a generic public factory method:

This gives you a Function2 that statically type-checks its inputs and outputs, despite the fact that it's the same one being used for all type parameters. Because of Java's compile-time type erasure, it makes no difference at runtime.

What's next?

I think this post is getting long enough. We've gone over the basics of first-class and higher-order functions. In the next posts, I'll add currying to the functions and talk about the idea of everything (conceptually) being a function.

The code for this series of blog posts is available in my public github repository.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.