Taking a break (or possibly moving on) from Project Lambda posts, I've decided that it's a good time to look at pattern matching, a common approach in functional programming that achieves much the same end result as object-oriented inheritance polymorphism. That said, I'll use Project Lambda (JDK8) features when writing Java code (since it's much nicer). I'll also diverge from most of my previous posts by using Scala here, since Java doesn't support (elegant) pattern matching (though I've just discovered JMatch, which apparently extends Java with pattern matching support, and I may want to play around with it).
An object-oriented example
For comparison, here is an object-oriented implementation of an immutable list in Java. An immutable list is either an EmptyList (a singleton that holds nothing) or a NonEmptyList (which holds a value and a reference to the rest of the list). To hide the implementation details from calling code, I've implemented the two cases as private static classes and exposed empty() as a factory method to get the EmptyList. The other operations (such as prepend and map) are instance methods of ImmutableList.
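A minimal sketch of what such a class might look like follows. It is a reconstruction under assumptions rather than a definitive listing: the names ImmutableList, EmptyList, NonEmptyList, empty(), prepend, and map come from the text, while isEmpty, the toString format, and the exact signatures are illustrative guesses.

```java
import java.util.function.Function;

// Illustrative reconstruction; signatures beyond empty/prepend/map are assumptions.
abstract class ImmutableList<T> {
    // One shared instance serves as the empty list for every element type.
    private static final ImmutableList<Object> EMPTY = new EmptyList<>();

    // Factory method: the only way for calling code to obtain the empty list.
    @SuppressWarnings("unchecked")
    static <T> ImmutableList<T> empty() {
        return (ImmutableList<T>) EMPTY;
    }

    // Identical for both cases, so it is implemented once in the base class.
    ImmutableList<T> prepend(T value) {
        return new NonEmptyList<>(value, this);
    }

    // Each case supplies its own behavior (inheritance polymorphism).
    abstract boolean isEmpty();

    abstract <R> ImmutableList<R> map(Function<T, R> f);

    private static final class EmptyList<T> extends ImmutableList<T> {
        @Override
        boolean isEmpty() { return true; }

        @Override
        <R> ImmutableList<R> map(Function<T, R> f) { return empty(); }

        @Override
        public String toString() { return "Nil"; }
    }

    private static final class NonEmptyList<T> extends ImmutableList<T> {
        private final T head;
        private final ImmutableList<T> tail;

        NonEmptyList(T head, ImmutableList<T> tail) {
            this.head = head;
            this.tail = tail;
        }

        @Override
        boolean isEmpty() { return false; }

        @Override
        <R> ImmutableList<R> map(Function<T, R> f) {
            // Map the rest of the list, then put the mapped head in front.
            return tail.map(f).prepend(f.apply(head));
        }

        @Override
        public String toString() { return head + " :: " + tail; }
    }
}
```

With this sketch, ImmutableList.<Integer>empty().prepend(2).prepend(1).map(x -> x * 10) produces a list whose toString is "10 :: 20 :: Nil". Calling code never sees which of the two private classes it is holding.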
Algebraic Data Types
This linked list is a form of algebraic data type. Algebraic data types are made up of "products" (groups of fields, like records or structs) or "sums" (disjoint unions) of other types. In particular, the ImmutableList type is the sum of EmptyList and NonEmptyList (since an instance comes from one set or the other), while NonEmptyList is the product of its field types (the generic type T for head and ImmutableList<T> for tail). EmptyList is a single-element set. Note that "product", in this context, effectively means Cartesian product. Really, algebraic data types describe a set of possible values (in the mathematical definition of "set").
In mathematics, we tend not to talk about elements of a set "doing" things. Instead, we have functions and operators that act on elements of a set and produce other elements (either from the same set or from some other set). Thus, the object-oriented notion of member functions or methods doesn't really map well. Instead, we might define a function over a sum type by specifying partial functions over the underlying types and producing a resulting total function. We basically say, "If you live in this bucket, return something, whereas if you live in this other disjoint bucket, return something else".
In the orthodox functional programming world, this matching by partial functions is exactly what happens, by matching the "patterns" that describe the underlying types of a sum type. Before digging deeper, I think it helps to look at the same example above, written in Scala in a more purely functional way. (Since Scala was built to compile to JVM bytecode, it still has classes and objects, and our types are still effectively Java classes.)
There are several things to note in comparison to the Java code above:
- Neither EmptyList nor NonEmptyList has any methods. They don't "do" anything.
- EmptyList is a singleton by virtue of the Scala object keyword. It really is a single-element set.
- The sealed trait keyword pair basically means "Everything that implements this interface is defined in this file". Since the MyImmutableList interface defines no methods, you can consider it purely a marker interface. Thus, we have the same level of implementation-hiding as we had with our private inner classes above. As a bonus, it's illegal for a class in another Scala file to announce that it implements MyImmutableList, which cannot be said for our abstract class in the Java example above.
- The operations (including prepend and map) are defined separately, each operating on a MyImmutableList.
- For operations like map, the implementations need to decide which particular part of the disjoint union of the sum type applies. They do this by pattern matching against the underlying types. This is logically equivalent to inheritance polymorphism in the object-oriented example.
- For prepend, no pattern match is necessary. The logic is the same in both cases. This is equivalent to defining the implementation in the abstract base class in the object-oriented example.
- Pattern matching captures variables against the fields of the product type NonEmptyList, such that we can use them on the right-hand side of the case expression. By convention, if we don't care about a particular field, we use _ as the variable name.
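For readers who would rather stay in Java, the style described in the list above can be approximated (much less elegantly) in JDK8 with data-only classes and standalone functions. This is a rough sketch under assumptions, not a translation of the Scala listing: instanceof tests stand in for case clauses, the ListOps holder class and the show operation are hypothetical names introduced for illustration, and Java 8 has no equivalent of sealed, so nothing stops another file from implementing the marker interface.

```java
import java.util.function.Function;

// Marker interface: it declares no methods, so the types carry data only.
interface MyImmutableList<T> {}

// The single-element set: exactly one instance ever exists.
final class EmptyList<T> implements MyImmutableList<T> {
    private static final EmptyList<Object> INSTANCE = new EmptyList<>();

    private EmptyList() {}

    @SuppressWarnings("unchecked")
    static <T> EmptyList<T> instance() {
        return (EmptyList<T>) INSTANCE;
    }
}

// The product type: a head value paired with the rest of the list.
final class NonEmptyList<T> implements MyImmutableList<T> {
    final T head;
    final MyImmutableList<T> tail;

    NonEmptyList(T head, MyImmutableList<T> tail) {
        this.head = head;
        this.tail = tail;
    }
}

// Operations live outside the data types, one function per operation.
final class ListOps {
    private ListOps() {}

    // No case distinction needed: the logic is the same for both cases.
    static <T> MyImmutableList<T> prepend(T value, MyImmutableList<T> list) {
        return new NonEmptyList<>(value, list);
    }

    // The instanceof test plays the role of the case clauses.
    static <T, R> MyImmutableList<R> map(Function<T, R> f, MyImmutableList<T> list) {
        if (list instanceof NonEmptyList) {
            NonEmptyList<T> cell = (NonEmptyList<T>) list;
            MyImmutableList<R> mappedTail = map(f, cell.tail);
            return new NonEmptyList<>(f.apply(cell.head), mappedTail);
        }
        return EmptyList.instance();
    }

    // Hypothetical extra operation, added only to make results visible.
    static <T> String show(MyImmutableList<T> list) {
        if (list instanceof NonEmptyList) {
            NonEmptyList<T> cell = (NonEmptyList<T>) list;
            return cell.head + " :: " + show(cell.tail);
        }
        return "Nil";
    }
}
```

Note how much of the work the casts and field accesses are doing by hand here. That boilerplate is exactly what a pattern such as a case clause binding the head and tail takes care of in one line.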
In the end, it's another way of saying the same thing. That said, let's look at what each approach gives us. For this next bit, I'm totally stealing from Professor Dan Grossman and material I learned from his Programming Languages course on Coursera. As a "thank you" for that knowledge he gave to me for free, I encourage you to take his future courses and use up more of his time. I'm sure he will appreciate it.
First, let's consider the types and the operations, since that's what we want to implement:
In the object-oriented case, we basically implement the question marks by rows. That is, we implement one of EmptyList or NonEmptyList, filling in all of the methods, and then implement the other. If a particular column (or most of it) shares code, we implement it in the base class (overriding when needed).
In the functional case (using pattern matching), we implement the question marks by column. For each operation, we describe how it will work on each of the underlying types. If a particular column doesn't need to distinguish between types, we don't.
Another way of looking at this table is that the columns are verbs, and the rows are nouns.
Which approach is better? Given that this blog is about little besides functional programming, you might expect me to say "the functional approach". The answer, of course, is, "It depends". Given that we spend way more time modifying, refactoring, and updating code than we spend writing it in the first place, you need to ask yourself, looking at the above table, "Will I be adding more rows, or will I be adding more columns?". In most cases, you'll be adding more columns (that is, describing more operations on your data). There are, however, cases where you add more rows. In particular, I believe that it's no mistake that object-oriented programming took over at the same time as the "GUI revolution". Specifically, most graphical elements (widgets) can be summed up by two operations:
handleEvent. In that case, you are more likely to be adding rows, as you define different kinds of widgets. That said, I consider it unfortunate that today's "server revolution" and service-oriented architectures continue to shoehorn things into object-orientation, even when better approaches exist (and predate object-oriented programming).
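To make the "adding rows" case concrete, here is a small sketch of the widget situation. Everything in it is an assumption made for illustration: the Widget interface, the Button and Checkbox classes, and the render method (a stand-in for whatever drawing operation a real toolkit would have) are hypothetical names, and only handleEvent is taken from the text.

```java
// Two operations per widget. Only handleEvent is named in the text;
// render is a hypothetical stand-in for a drawing operation.
interface Widget {
    String render();

    // Returns the widget's next state, keeping each widget immutable.
    Widget handleEvent(String event);
}

final class Button implements Widget {
    private final String label;

    Button(String label) {
        this.label = label;
    }

    @Override
    public String render() { return "[ " + label + " ]"; }

    @Override
    public Widget handleEvent(String event) {
        return this; // a button keeps no state, so nothing changes
    }
}

final class Checkbox implements Widget {
    private final boolean checked;

    Checkbox(boolean checked) {
        this.checked = checked;
    }

    @Override
    public String render() { return checked ? "[x]" : "[ ]"; }

    @Override
    public Widget handleEvent(String event) {
        // A click toggles the box; any other event leaves it unchanged.
        return "click".equals(event) ? new Checkbox(!checked) : this;
    }
}
```

Adding a new kind of widget here means writing one new class and touching nothing else, which is the row-wise strength. Adding a third operation to Widget, on the other hand, would force an edit to every existing class.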
In short, what would you rather your code do? Make things (create nouns)? Or do things (create verbs)? In practice, I believe most of us would rather focus on building actions, but work in development environments where we develop objects.
For some fun reading, I suggest that anyone who works in an object-oriented language (and especially folks working in Java) read Execution in the Kingdom of Nouns, a nice tongue-in-cheek look at Java anti-patterns like AbstractSingletonProxyFactoryBean (though I believe that article predates that particular monstrosity).
In the end, I am being quite disingenuous, and downright unfair to object-oriented programming. There are cases where grouping types under a single hierarchy (as one does in object-oriented programming) makes sense, even in the functional world. Specifically, my purely-functional Scala example is not how the Scala standard library implements immutable lists. Instead, it takes an object-oriented approach, since it means that map can be implemented in terms of base traits like Iterable, so the same code can be reused for other types. Similarly, Haskell (the purest of semi-mainstream functional programming languages) has typeclasses, which are not object-oriented classes by any stretch of the imagination, but they allow you to define functions that take an instance of any type, so long as the type has certain associated functions that are defined in the given typeclass. (Okay... they're kind of like saying, "I will take an instance of any type that implements this particular interface".) All that to say, while I'm gently suggesting that the object-oriented mindset is not always the correct one, it still definitely has its place, even in the functional programming world.
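The reuse argument can be sketched in Java using a JDK8 default method. This is only an analogy under assumptions, and it says nothing about how the Scala library is actually written: Mappable, mapToList, and Range are hypothetical names invented for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical base type: anything that can be iterated gets mapToList for free.
interface Mappable<T> extends Iterable<T> {
    default <R> List<R> mapToList(Function<T, R> f) {
        List<R> result = new ArrayList<>();
        for (T item : this) {
            result.add(f.apply(item));
        }
        return result;
    }
}

// One implementor: it only explains how to iterate and inherits the mapping code.
final class Range implements Mappable<Integer> {
    private final int start;
    private final int end; // exclusive upper bound

    Range(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int current = start;

            @Override
            public boolean hasNext() { return current < end; }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current++;
            }
        };
    }
}
```

For example, new Range(1, 4).mapToList(x -> x * x) yields a list containing 1, 4, and 9. Any other type that implements iterator() picks up the same operation without writing it again.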
Are you curious about what the above Scala example compiles to? It turns out that it's actually not very exciting (using Scala 2.9). It's pretty much like what you would hand-write in Java (except that the Scala compiler doesn't actually call Java 1.7's Objects.equals, but rather inlines similar logic):
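As a rough idea of the shape involved, here is a hand-written Java class for the NonEmptyList case. It is an approximation written for illustration, not actual compiler output: the accessor names, the hashCode formula, and the toString format are assumptions, and the real generated class contains additional members.

```java
// Marker interface, mirroring the trait that declares no methods.
interface MyImmutableList<T> {}

// Hand-written approximation of a two-field value class (not compiler output).
final class NonEmptyList<T> implements MyImmutableList<T> {
    private final T head;
    private final MyImmutableList<T> tail;

    NonEmptyList(T head, MyImmutableList<T> tail) {
        this.head = head;
        this.tail = tail;
    }

    T head() { return head; }

    MyImmutableList<T> tail() { return tail; }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof NonEmptyList)) {
            return false;
        }
        NonEmptyList<?> that = (NonEmptyList<?>) other;
        // Inlined null-safe comparisons, the same check Objects.equals performs.
        boolean sameHead = head == null ? that.head == null : head.equals(that.head);
        boolean sameTail = tail == null ? that.tail == null : tail.equals(that.tail);
        return sameHead && sameTail;
    }

    @Override
    public int hashCode() {
        int result = head == null ? 0 : head.hashCode();
        return 31 * result + (tail == null ? 0 : tail.hashCode());
    }

    @Override
    public String toString() {
        return "NonEmptyList(" + head + "," + tail + ")";
    }
}
```

The point is simply that there is no magic: fields, accessors, and structural equality are all things a Java programmer would write by hand, only without having to.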