Tuesday, January 29, 2013

Project Lambda - First Impressions

Before launching into the actual post, I would like to make a quick plug for Martin Odersky's "Functional Programming Principles in Scala" course on Coursera. It will begin March 25th and run for 7 weeks. I took the course during its previous run (September-November last year), and really enjoyed it. No prior knowledge of Scala is required. The material is well-presented and quite interesting. Sign up now!

Moving on to today's topic

I created an impromptu poll on Google+ to see what I should write about next, not really expecting any suggestions (especially since most of my Google+ friends only seem to check in roughly once per month or so). As it happens, I got a vote for Project Lambda, the upcoming support for functional programming features planned for JDK8. So, with that unanimous support, I downloaded a prerelease (b74) and started porting the functional programming framework I've been developing in earlier posts.

Mapping concepts

Taking a look at that new java.util.function package, you find a fair number of interfaces (I count 38 in b74), but they're mostly specializations of the following few core interfaces:

  • Supplier: Declares a get() method that returns a value. Effectively interface-equivalent to Function0 from my previous posts. There are specialized versions (BooleanSupplier, DoubleSupplier, IntSupplier, LongSupplier, but no love for floats) to return unboxed primitive values.
  • Function: Has an apply method that takes an input of type T and returns an output of type R. Analogous to Function1 from my previous posts. Has subinterface UnaryOperator (for functions where the return type is the same as the input type), and many primitive type specializations (to avoid boxing), including Predicate (for functions that return boolean), IntFunction, LongFunction, DoubleFunction, plus many combinations of single-input functions for different input/output types (where at least one is primitive).
  • BiFunction: Analogous to the Function2 type I described in earlier posts, these take two arguments and return a value (where, in the most general case, each input and output type may be different). Has a subinterface BinaryOperator for the special case when all three types are the same, plus specialized versions of BinaryOperator for each of the primitive types.
  • Block and BiBlock: Like Function and BiFunction, but covering the case where nothing is returned (so the code must produce a side-effect). I handled these cases in my previous post by using functions that return the empty tuple (Tuple0). Again, there are several specialized interfaces to handle primitive inputs to the block.

Lambda syntax

By far, the biggest change you'll find in JDK8 from Project Lambda is the new syntax added to make working with the above interfaces much less verbose. (My previous posts serve as an example of how ugly it can be to create anonymous classes to wrap up a little bit of logic in a function object.) Here is a "before and after" example, using an anonymous class versus the lambda syntax:

I think we can all agree that the second conveys just as much useful information without the noise of the anonymous class boilerplate. IntelliJ even gives me a warning, telling me that the first version can be replaced with a lambda. This syntax can be applied to any anonymous implementation of an interface where there's a single method to implement (which includes a number of AWT or Swing EventListeners, as well as Runnable and Callable). I would argue that the lambda syntax will be pretty useful even if you're not interested in functional programming.

Since many existing codebases probably have static utility methods that behave as functions, there is also syntax to use these methods as function objects:

Similarly, an instance method without arguments can implicitly be treated as a function where the instance is the input and the result of invoking the method on the instance is the output (note that IntFunction<String> takes a String as input and returns an int):

Default methods

If you paid close attention to my previous posts (or looked at the code in GitHub), you may have noticed that most of my function types were classes. This was because I wanted to be able to implement certain helper methods that would work if you provided the actual logic of a function. For example, Function1 has a compose method that would allow you to define composition of functions. Similarly, the function classes have the logic for partial application and currying.

This is kind of annoying, as you can only inherit from a single class, whereas you can implement multiple interfaces. It would be much nicer to be able to define some logic that comes "for free" once the method(s) of an interface are defined. In particular, this is how Scala's traits work (and was one of the features that made my Why I like Scala list). As it happens, this is another feature coming to Java with Project Lambda.

There is a new keyword, default, that can be added before an interface method that allows you to define a body for the method. From what I can tell, while this introduces some multiple inheritance of behaviour to Java, it still isn't quite as flexible as Scala's multiple inheritance of traits (which avoids the diamond problem by defining a linearization of inheritance). For example, the following doesn't compile for me:

The compiler error is

java: class DefaultTest.HelloWorld inherits unrelated defaults for print() from types DefaultTest.Hello and DefaultTest.World

That said, default methods still allow some rich inheritance of interfaces, and could be used to significantly improve the collections API (much as the use of traits has given Scala a much better collections API). Based on this FAQ answer, which refers to new interfaces java.util.Sized and java.util.Traversable, it looks like this cleanup was planned at some point, but those interfaces don't exist in the b74 build of Project Lambda. I don't know if they backtracked on improving collections, or if those changes are excluded from current builds.

New collection types

In a previous post, I covered the Option type (taking the name from Scala). As it happens, Project Lambda includes the java.util.Optional type, which covers pretty much the same niche.

They've also added a hefty java.util.stream package that handles lazily-evaluated sequences, complete with evaluation pipelines (designed to accommodate concurrent evaluation). I won't go into details on streams, for the following reasons:

  • I want to discuss streams (as a type) in a later post, since they're pretty easy to implement.
  • This post is already going to be too long, and streams would roughly double the length.
  • I haven't had time to fully investigate pipelines, so anything I write wouldn't be doing them justice.

What's missing?

So far, Project Lambda doesn't seem to include helpers for partial application of functions, or lazy evaluation. Luckily, those are both pretty easy to implement, and the syntax is much nicer with the new syntax.

Lazy Evaluation

To implement lazy evaluation, let's revisit the Function0 idea, but this time, we'll make it an interface. Of course, interfaces still can't hold data, so we'll "remember" earlier evaluations using a static WeakHashMap (relying on identity equality) and take advantage of the hot new computeIfAbsent default method added to Map.

Next, let's create a Function1 that extends Function and adds the ability to take a Supplier (which may or may not be a Function0) as input to apply. This will allow us to eagerly evaluate a previous lazy result. Whereas previously, I made the return value always lazy (where you could then force evaluation with get()), the built-in Function interface we're extending performs eager evaluation. So, we'll expose lazy evaluation via the applyLazy methods. I'll use the T,R ordering of type arguments, since my earlier R,T ordering was silly and inconsistent with what everyone else does.

Partial Application

In my previous implementation, I had an abstract base class called Curryable which implemented partial application in terms of currying. Then, it was the responsibility of each FunctionN class to implement curry to produce a Function1 that binds the first argument and returns a FunctionN-1. Unfortunately, while writing this post, I discovered a fatal flaw in that implementation that actually invalidates one of my examples in my previous post. Specifically, that implementation of partial evaluation always evaluated the partial arguments eagerly, so one of the page downloads started in the middle of constructing the lazy result of the Function2. (In retrospect, I should have been suspicious when I found that my asynchronous implementation was consistently faster than my earlier concurrent implementation. Of course it was, it had already downloaded a page before the timer started.) We'll do partial application more directly this time, so that laziness works.

Anyway, let's define Function2 in terms of BiFunction, with the additional partial application methods. Function3 and up are analogous, but need to be defined from scratch, since there are no built-in interfaces for functions of more than two arguments.

Let's confirm that this works, and play some more with lambda syntax by implementing foldLeft as a static method, binding it to a function object (a Function3, specifically), and then using partial application to produce a Function1 that concatenates strings:

That's considerably easier than the hoops I jumped through to implement a higher-order version of foldLeft previously.

Conclusions

So far, Project Lambda looks like it could actually make functional programming much more viable on Java. While my examples here haven't done anything that I haven't been able to do with my existing toy framework, the syntax alone is a massive improvement. As an example of the impact of good syntax, I immediately spotted my mistake with my earlier Curryable implementation when I started to write it with lambdas. Previously, I had missed the mistake amidst all of the anonymous function boilerplate.

I'm still not convinced that it's on par with Scala, though, since it largely seems to be catching up to where Scala was a few years ago, and you're still stuck with the old "everything is mutable" collections framework. Various features of the Scala compiler (e.g. type inference, special handling of apply) allow you to write even less boilerplate than what you get with Project Lambda. Still, if Java developers get used to this stuff (and like it), it might make it easier to convince more of them to drop the uglier legacy parts of Java and move to Scala (which still keeps the good old JVM and JIT compiler).

That said, since I imagine that I'll be stuck mostly working in Java for the foreseeable future, I can only hope that JDK8 comes out soon, so I can use these goodies in my professional life.

Finally, remember that what I've covered here is only scratching the surface. This is what I've been able to pick up playing with Project Lambda in about a week, between when my daughter falls asleep and heading to bed myself. For more details, check out the Project Lambda homepage, and the resources linked from there.


No comments:

Post a Comment

Note: Only a member of this blog may post a comment.