On my Twitter feed, I recently got a link to this document by Brian Goetz explaining how JDK8 translates lambdas into bytecode. I was surprised, since I had assumed that lambdas were simply syntactic sugar for anonymous inner classes that implement one of the functional interfaces, much as anonymous functions in Scala currently work. That is, I thought that Project Lambda was basically doing the same thing as I've been doing with my functional types, but eliminating the annoying boilerplate that goes along with an anonymous class and using better type inference.
It turns out that it's considerably more interesting than that, and makes use of the InvokeDynamic bytecode instruction added in JDK7. It inspired me to dig a little deeper with
javap and try writing up an explanation in simpler terms (and only covering the simple case), since that article is pretty heavy (though I have to admit that I spent a solid hour on a Saturday night going through it with glee — I'll claim that my standards for a fun Saturday night have lowered considerably with a small child at home, rather than admitting that I'm a huge nerd).
How don't lambdas work?
I think it's clearer to understand how lambdas work by first considering how they don't work. That is, they're not just a more succinct way of writing anonymous inner classes.
Throughout the rest of this post, I'll use the following example:
While only a few lines in total, this example could desugar to something a fair bit messier. It's useful to note that our simple lambda is capturing the variable
i from its scope.
Suppose lambdas were simply a nicer way to write anonymous classes. The above example would be equivalent to:
If you've ever looked at the output directory from compiling code with anonymous functions, you would notice that the above code would compile to
AnonymousFunction$1.class. If we run
javap AnonymousFunction\$1.class (escaping the
$ so the shell doesn't think I'm referencing a variable), we get the following output:
So, we have a class that holds the variable
i as a field (passed to the constructor), and two versions of the
apply method, with and without type erasure. (The
Object version just casts its input to an
Integer and invokes the
More explicitly, anonymous functions are also a kind of syntactic sugar. In effect,
AnonymousFunction.java compiles down to something like the following:
The output of running
javap AnonymousFunction\$1.class looks nearly identical (save the
1 difference) to
So, how do lambdas actually work?
It's worth noting that the contents of my compiled output directory (with file sizes) are as follows:
Interesting... no secondary classes are produced by compiling the original
SimpleLambda code. So, how does it work?
The code of the lambda is moved to a private static method called (in my case)
lambda$6, that takes
i as parameters and does the addition. That's the easy part. What does
main actually compile to? Let's ask
javap (using the
I think of this bytecode as divided into three parts:
- Bytecode lines (distinct from GitHub gist lines) 0 through 6 take care of assigning local variable
iand putting it on the stack.
- Lines 7, 8, and 13 load
ifrom the stack, do an
invokedynamicto actually create the
Function, then push it onto the stack.
- Lines 14 through 27 take care of retrieving the constant
System.out, loading the
Functionfrom the stack, boxing
Integer, calling our
Function.applymethod, and calling
We're most interested in line 8, which triggers the creation of the
Function. Let's see what
#3 points to (near the top of the
#3 = InvokeDynamic #0:#40 // #0:lambda$:(Ljava/lang/Integer;)Ljava/util/function/Function;
So, it's pointing to
#0 with the expectation of a
CallSite that takes an
Integer and returns a
Function (effectively a
Function factory). Let's look at
This is basically binding the last three arguments for a call to
instantiatedMethodType) to our desired "single abstract method" (
#37 Function.apply), the implementation method (
#38, the previously-mentioned private static
lambda$6), and the type arguments for the
Function.apply method (
#39 — it takes an
Integer (within parentheses) and returns an
Integer (outside the parentheses)). The first three arguments are supplied by the
invokedynamic call itself. (I believe arguments 2 and 3 are
(Ljava/lang/Integer;)Ljava/util/function/Function are bound by
Right now, if I look at the code for
LambdaMetaFactory.metaFactory, it's basically constructing a new
InnerClassLambdaMetafactory and asking it to generate a
CallSite. So, at runtime, we're generating an inner class that implements
Function<Integer, Integer>, takes an
Integer as a constructor parameter, and implements
apply by calling
lambda$6 with its field and the argument to
Let's confirm this by creating a class that outputs this stuff at runtime:
And here is the output from running that:
In this case, our static method got called
lambda$0, but you can see that it's called from the
apply method of an inner class that holds an
Integer field, which it probably received through its constructor (since it's a
final field). Effectively, at runtime, we're getting the behaviour from
InnerClassFunction.java from above (except that the actual code makes one more method call to the static method), which in turn is the same as
AnonymousFunction.java, which is not what lambdas become. The one "major" difference is that the generate class doesn't implement the
Integer version of
apply, sticking purely to the erased version (which makes sense, since the checked cast is already required before calling
In a very roundabout way, I hope that I've shown you that lambdas are not equivalent to anonymous classes (which are equivalent to inner classes that capture local variables via constructor arguments), but they are actually equivalent at runtime... for now. The truth is that this somewhat convoluted
invokedynamic approach leaves things open for improvements in the future.
The current implementation of Project Lambda (at the time of writing) does, in fact, generate inner classes at runtime. However, by using the
invokedynamic call to
LambdaMetadataFactory, future runtimes will be able to provide a different implementation that may have better performance, and still work with code compiled now (or at least after JDK8 eventually ships). As a secondary bonus, by not performing this expansion at compile time, compile time should be shorter and the generated bytecode will be smaller. (Even with this trivial example,
SimpleLambda compiled to 20% less bytecode than
InnerClassFunction. I believe the difference would be more significant with more lambdas.)
I hope that this post has made you think more about how lambdas will work, along with how anonymous classes and inner classes already work.
It's worth noting that Scala currently compiles functions to inner classes, though there has been some discussion about using the
invokedynamic approach there as well.