Thursday, February 7, 2013

More fun with Project Lambda

What's new in Project Lambda this week?

It seems that the Project Lambda developers have conspired to ensure that my blog will always be out of date. I recently wrote about my first impressions from using Project Lambda, based on build b74. Now that I'm writing this, b75 is out and they've renamed the Block, IntBlock, DoubleBlock, and various BiBlock interfaces to replace the word Block with Consumer. I suppose that's more consistent with Supplier, since a Consumer consumes a value and returns nothing, while a Supplier takes no input and produces something. It also ties in well with the pipeline metaphor they're using for Streams (which will still need to wait for a later post, since I got playing with default methods in interfaces this week).

Additionally, they've added several more interfaces to java.util.function to accommodate Functions and BiFunctions that return primitive values (still to avoid the overhead of boxing, presumably). It's interesting to compare these specialized (and specially-named) interfaces to Scala's use of the @specialized annotation (discussed, here, for example) to avoid the same problem.

Pimping Collections

I'm horribly abusing a term that has grown popular in the Scala community for the case where you appear to add functionality to an existing type by providing an implicit view that converts objects of the existing type to objects of the enhanced type. This is referred to as the pimp my library pattern (going back several years, including by Prof. Odersky here). What I'm doing here isn't implicit, but it does show a way that we can use interfaces with default methods to layer functionality onto library types with relatively little ceremony.

In some of my previous posts, (like this one and especially this one), I talked about some nice higher-order functions on collections that most functional programming language environments provide as part of their standard library. The fact is that most of these higher-order functions can be implemented in terms of iterating through elements. Since Java 1.5, the standard collections (except Map and its implementations) all implement the Iterable interface. Wouldn't it be great if we could add these higher-order functions as methods on existing collection types? We can't really do that (without compiling our own version of the standard library), but we can inject the iterator method from the builtin types as a lambda to construct a type that does have this functionality.

Yo dawg, I herd you like iterators

Let's extend the Iterable interface with an AugmentedIterable interface that defines additional functionality based on the iterator method:

public interface AugmentedIterable<T> extends Iterable<T> {
default <R> R foldLeft(R init, BiFunction<? super R, ? super T, ? extends R> f) {
R out = init;
for (T elem : this) {
out = f.apply(out, elem);
}
return out;
}
default <R> AugmentedIterable<R> map(Function<? super T, ? extends R> f) {
final Iterable<T> origIterable = this;
return () -> new Iterator<R>() {
final Iterator<T> origIter = origIterable.iterator();
@Override
public boolean hasNext() {
return origIter.hasNext();
}
@Override
public R next() {
return f.apply(origIter.next());
}
};
}
default AugmentedIterable<T> filter(Predicate<? super T> p) {
final Iterable<T> origIterable = this;
return () -> new Iterator<T>() {
final Iterator<T> origIter = origIterable.iterator();
T next;
boolean hasNext = false;
@Override
public boolean hasNext() {
if (hasNext) return true;
while (origIter.hasNext()) {
next = origIter.next();
if (p.test(next)) {
hasNext = true;
return true;
}
}
return false;
}
@Override
public T next() {
if (hasNext) {
hasNext = false;
return next;
}
throw new NoSuchElementException();
}
};
}
default <R> AugmentedIterable<R>
flatMap(Function<? super T, ? extends Iterable<? extends R>> f) {
final Iterable<T> origIterable = this;
return () -> new Iterator<R>() {
final Iterator<T> origIter = origIterable.iterator();
Iterator<? extends R> nestedIterator = null;
@Override
public boolean hasNext() {
while (nestedIterator == null || !nestedIterator.hasNext()) {
if (origIter.hasNext()) {
nestedIterator = f.apply(origIter.next()).iterator();
}
}
return (nestedIterator != null && nestedIterator.hasNext());
}
@Override
public R next() {
return nestedIterator.next();
}
};
}
default AugmentedIterable<T> take(int n) {
final Iterable<T> origIterable = this;
return () -> new Iterator<T>() {
final Iterator<T> origIter = origIterable.iterator();
private int pos = 0;
@Override
public boolean hasNext() {
return pos < n && origIter.hasNext();
}
@Override
public T next() {
pos++;
return origIter.next();
}
};
}
default AugmentedIterable<T> drop(int n) {
final Iterable<T> origIterable = this;
return () -> {
final Iterator<T> origIter = origIterable.iterator();
for (int i = 0; i < n; i++) {
if (origIter.hasNext()) {
origIter.next();
} else {
break;
}
}
return origIter;
};
}
// I'm using this for testing, but it's also generally useful
default List<T> toList() {
final List<T> out = new ArrayList<>();
forEach(out::add);
return out;
}
}

As you can see, many of the methods return a new AugmentedIterable whose iterator method (defined by a lambda) delegates to the iterator method of the original AugmentedIterable. I put an iterator in your iterator, so you can iterate while you iterate.

Note that this implementation with delegated iterators has the side-effect of making most of these methods lazy (except toList and foldLeft, which don't produce new AugmentedIterables).

Let's add a quick utility method to construct an AugmentedIterable from an existing Iterable, and test our methods:

public final class AugmentedIterableUtils {
private AugmentedIterableUtils() {
}
public static <T> AugmentedIterable<T> augment(Iterable<T> iterable) {
return iterable::iterator;
}
}
public class AugmentedIterableTest {
@Test
public void testFoldLeft() throws Exception {
assertEquals(10,
augment(Arrays.asList(1, 2, 3, 4)).foldLeft(0, (a, b) -> a + b).intValue());
}
@Test
public void testMap() throws Exception {
assertEquals(Arrays.asList(2, 4, 6, 8),
augment(Arrays.asList(1, 2, 3, 4)).map((a) -> a * 2).toList());
}
@Test
public void testFilter() throws Exception {
assertEquals(Arrays.asList(2, 4, 6, 8),
augment(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8))
.filter((a) -> a % 2 == 0).toList());
}
@Test
public void testTake() throws Exception {
assertEquals(Arrays.asList(1, 2, 3, 4),
augment(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8)).take(4).toList());
}
@Test
public void testDrop() throws Exception {
assertEquals(Arrays.asList(5, 6, 7, 8),
augment(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8)).drop(4).toList());
}
@Test
public void testLazyMap() throws Exception {
// Confirm that evaluation is lazy
final AtomicInteger counter = new AtomicInteger(0);
Function<Integer, Integer> doubleWithSideEffect = (a) -> {
counter.incrementAndGet();
return a * 2;
};
Predicate<Integer> isOddWithSideEffect = (a) -> {
counter.incrementAndGet();
return a % 2 == 1;
};
AugmentedIterable<Integer> result =
augment(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8))
.filter(isOddWithSideEffect)
.map(doubleWithSideEffect);
assertEquals(0, counter.get());
assertEquals(Arrays.asList(2, 6, 10, 14), result.toList());
// 8 calls to the predicate and 4 calls to mapping function
assertEquals(8 + 4, counter.get());
}
// A simple take on the rather dubious straw man argument given at
// http://cafe.elharo.com/programming/java-programming/why-functional-programming-in-java-is-dangerous/
@Test
public void testSquaresOfIntegers() throws Exception {
// An infinite iterable that generates positive integers
AugmentedIterable<Integer> integers = () -> new Iterator<Integer>() {
int i = 1;
@Override
public boolean hasNext() {
return true;
}
@Override
public Integer next() {
return i++;
}
};
assertEquals(Arrays.asList(1, 4, 9, 16, 25, 36, 49, 64),
integers.map((a) -> a * a).take(8).toList());
}
}

Conclusions

This still isn't quite as nice as the "pimp my library" pattern, as we still need either the explicit call to augment to convert our Collection into an AugmentedIterable. Furthermore, I didn't really need an interface with default methods to do this, as I believe I could have made AugmentedIterable an abstract class. Where things get interesting is that we could add more functionality derived from Iterable in another interface and then mix them together in a new interface that inherits from both. Alternatively, I could create new collection types that extend the built-in types but also add the AugmentedIterable behaviour:

public class AugmentedLinkedList<T> extends LinkedList<T>
implements AugmentedIterable<T> {
// Expose constructors from superclass
public AugmentedLinkedList() {
}
public AugmentedLinkedList(final Collection<? extends T> c) {
super(c);
}
}
public class AugmentedArrayList<T> extends ArrayList<T>
implements AugmentedIterable<T> {
// Expose constructors from superclass
public AugmentedArrayList(final int initialCapacity) {
super(initialCapacity);
}
public AugmentedArrayList() {
}
public AugmentedArrayList(final Collection<? extends T> c) {
super(c);
}
}