While it's not too hard to wrap your head around an immutable list structure with an efficient prepend operation, I wanted to try implementing a slightly more complicated immutable structure (but not as complicated as a trie). So, I decided to try implementing an immutable binary tree in Java, and then work on improving it a bit.
Note that this is not a "good" implementation. In particular, I'm not going to bother rebalancing the tree, so it's very possible for many operations to be
O(n)
instead of the O(log n)
that a balanced tree would provide.The main idea that I had trouble fully grasping when learning about immutable data structures is the idea that you usually don't need to copy everything when adding/removing an element. So, let's look at an example of a binary tree in pictures and consider what elements need to be replaced and what can be reused when we add a new element to the tree.
Consider the following tree:
The rules are simple: at each node, values less than the current node's value are stored in the left subtree, while values greater are stored in the right subtree.
Now, suppose we want to add the number 11 to the tree. It should be added as a right child of the 10 node. Since the current 10 node has no children (and is itself an immutable tree), we must allocate a new 10 node with 11 as a right child. The new 10 node will be the left child of a new 13 node, which can reuse the old 16 node as its right child (along with the children of 16). Finally, a new 8 node must be allocated with its right child pointing to the new 13 node and its left child pointing to the existing 4 node.
Here is a picture with the newly-allocated nodes highlighted in red:
In total, four new nodes were allocated: one for each level in the tree. For a balanced tree, this generalizes to
O(log n)
node allocations, or the same runtime efficiency of a mutable binary tree. The remainder of the tree continues to point to the old values. Assuming we aren't holding onto a reference to the old tree root, the original 8, 13, and 10 nodes would now be eligible for garbage collection. (Though if another thread were accessing the old version of the tree, it would not be affected by this change, avoiding a potential race condition.)Now, let's take a look at how to implement a simple immutable binary tree in Java.
To start with, I'll create a simple
ImmutableSet
interface (which doesn't follow the java.util.Set
interface, since that is fundamentally based on mutability -- e.g. the "normal" add
method has a void
return type):I've included a
toList
method for testing purposes, since it was easier than implementing the Iterable
interface.The fields and constructor of the ImmutableBinaryTree implementation are as follows:
All fields must be final in an immutable data structure.
Here is the implementation of the
add
method:We create a leaf node for the new element and add it as a subtree to the existing tree. The
addSubtree
method is reused in the implementation of remove
below, and could be used to implement a public addAll
method that allows merging of two ImmutableBinaryTree
s.The
remove
method has two base cases (when we don't recurse into child branches). If the element is not in the tree, we return the unmodified this
. If the current element is to be removed, we merge the left and right subtrees, arbitrarily choosing to add the right subtree as a descendent of the left (assuming the left subtree is not null, or else we simply return the right subtree).The
contains
method is effectively the same as it would be in a mutable binary tree:The
toList
method does an in-order traversal of the tree to return an ordered list:For debugging purposes, and to visualize the tree depth, I decided to override
toString
:Finally, here is a runner I used to compare the correctness of my implementation for addition/removal against the standard mutable
TreeSet
from the Java standard library:There are several problems with this implementation that I plan to address in a future post:
- The remove method creates a new node at each level, even if the element to remove does not exist.
- The use of
null
to indicate the lack of a left or right child opens us up to the potential ofNullPointerException
s and wastes space from those fields. (In particular, in a balanced binary tree, half the elements will be leaf nodes.) - Currently, there is no way to create an empty tree, since every node has a value. If you look at the
ImmutableTreeRunner
implementation above, I had to generate my first random value before I could create the ImmutableBinaryTree.
Hey there. I've been playing with immutable trees as well, and there's a decent solution to the Empty Tree problem if you model your tree as a few related classes:
ReplyDelete1: An Abstract Tree (or interface)
2: A Branch (representing a node with a value)
3: A Leaf (representing a node without a value)
An empty tree is a Leaf. Adding a value to a Leaf creates a new Branch with that value, and two new Leafs for Left/Right.
Adding to a branch always delegates to the Add method of the Left or Right child (which will eventually be a Leaf, which will do the actual creation of a new Branch node).
Hope that's somewhat useful. (Also, I appear to be raising the dead with this comment. Oops)
Hi Kevin,
ReplyDeleteThe branch/empty approach is far better than the use of nulls I described in this post (and it's consistent with e.g. the binary tree type described in Real World Haskell (find in page for "binary tree")), and is more practical than the approach I took in my second post on immutable binary trees.
That said, I still kind of like that approach (with concrete classes for DualBranch, LeftBranch, RightBranch, Leaf, and Empty), since every member variable is essential (that is, you never have a null or empty member), and the lack of one or both children is given by the class reference (which is mandatory in Java). A tree consisting purely of left branches in that approach ends up using as much memory as a linked list, I believe.
If I were implementing this for the real world, I think I would take the approach you describe, though.
Thanks for reading!