While it's not too hard to wrap your head around an immutable list structure with an efficient prepend operation, I wanted to try implementing a slightly more complicated immutable structure (but not as complicated as a trie). So, I decided to try implementing an immutable binary tree in Java, and then work on improving it a bit.
Note that this is not a "good" implementation. In particular, I'm not going to bother rebalancing the tree, so it's very possible for many operations to be
O(n)
instead of the O(log n)
that a balanced tree would provide.The main idea that I had trouble fully grasping when learning about immutable data structures is the idea that you usually don't need to copy everything when adding/removing an element. So, let's look at an example of a binary tree in pictures and consider what elements need to be replaced and what can be reused when we add a new element to the tree.
Consider the following tree:

The rules are simple: at each node, values less than the current node's value are stored in the left subtree, while values greater are stored in the right subtree.
Now, suppose we want to add the number 11 to the tree. It should be added as a right child of the 10 node. Since the current 10 node has no children (and is itself an immutable tree), we must allocate a new 10 node with 11 as a right child. The new 10 node will be the left child of a new 13 node, which can reuse the old 16 node as its right child (along with the children of 16). Finally, a new 8 node must be allocated with its right child pointing to the new 13 node and its left child pointing to the existing 4 node.
Here is a picture with the newly-allocated nodes highlighted in red:

In total, four new nodes were allocated: one for each level in the tree. For a balanced tree, this generalizes to
O(log n)
node allocations, or the same runtime efficiency of a mutable binary tree. The remainder of the tree continues to point to the old values. Assuming we aren't holding onto a reference to the old tree root, the original 8, 13, and 10 nodes would now be eligible for garbage collection. (Though if another thread were accessing the old version of the tree, it would not be affected by this change, avoiding a potential race condition.)Now, let's take a look at how to implement a simple immutable binary tree in Java.
To start with, I'll create a simple
ImmutableSet
interface (which doesn't follow the java.util.Set
interface, since that is fundamentally based on mutability -- e.g. the "normal" add
method has a void
return type):
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import java.util.List; | |
public interface ImmutableSet<T extends Comparable<T>> { | |
ImmutableSet<T> add( T element ); | |
ImmutableSet<T> remove( T element ); | |
boolean contains( T element ); | |
List<T> toList(); | |
} |
I've included a
toList
method for testing purposes, since it was easier than implementing the Iterable
interface.The fields and constructor of the ImmutableBinaryTree implementation are as follows:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
public class ImmutableBinaryTree<T extends Comparable<T>> implements ImmutableSet<T> { | |
private final ImmutableBinaryTree<T> left; | |
private final ImmutableBinaryTree<T> right; | |
private final T value; | |
public ImmutableBinaryTree(ImmutableBinaryTree<T> left, | |
ImmutableBinaryTree<T> right, T value) { | |
this.left = left; | |
this.right = right; | |
this.value = value; | |
} |
All fields must be final in an immutable data structure.
Here is the implementation of the
add
method:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@Override | |
public ImmutableBinaryTree<T> add(T element) { | |
if ( value.compareTo(element) == 0 ) { | |
return this; // Element already exists | |
} | |
return addSubtree(new ImmutableBinaryTree<T>(null, null, element)); | |
} | |
private ImmutableBinaryTree<T> addSubtree( ImmutableBinaryTree<T> subTree ) { | |
if ( subTree == null ) | |
{ | |
return this; | |
} | |
if ( value.compareTo(subTree.value) == 0 ) { | |
// Need to merge subTree children with this tree's children | |
final ImmutableBinaryTree<T> newLeft = left !=null ? left.addSubtree(subTree.left) : subTree.left; | |
final ImmutableBinaryTree<T> newRight = right !=null ? right.addSubtree(subTree.right) : subTree.left; | |
return new ImmutableBinaryTree<T>(newLeft, newRight, value); | |
} else if ( value.compareTo(subTree.value) < 0 ) { | |
// Element goes on right subtree | |
if ( right != null ) { | |
return new ImmutableBinaryTree<T>(left, right.addSubtree(subTree), value); | |
} else { | |
return new ImmutableBinaryTree<T>(left, subTree, value); | |
} | |
} else { | |
// Element goes on left subtree | |
if ( left != null ) { | |
return new ImmutableBinaryTree<T>(left.addSubtree(subTree), right, value); | |
} else { | |
return new ImmutableBinaryTree<T>(subTree, right, value); | |
} | |
} | |
} |
We create a leaf node for the new element and add it as a subtree to the existing tree. The
addSubtree
method is reused in the implementation of remove
below, and could be used to implement a public addAll
method that allows merging of two ImmutableBinaryTree
s.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@Override | |
public ImmutableBinaryTree<T> remove(T element) { | |
if ( value.compareTo(element) == 0 ) { | |
if ( left == null && right == null ) | |
return null; | |
if ( left == null ) return right; | |
if ( right == null ) return left; | |
return left.addSubtree(right); | |
} | |
if ( value.compareTo(element) < 0 && right != null ) { | |
return new ImmutableBinaryTree<T>(left, right.remove(element), value); | |
} | |
else if ( value.compareTo(element) > 0 && left != null ) { | |
return new ImmutableBinaryTree<T>(left.remove(element), right, value); | |
} | |
return this; // Element not found | |
} |
The
remove
method has two base cases (when we don't recurse into child branches). If the element is not in the tree, we return the unmodified this
. If the current element is to be removed, we merge the left and right subtrees, arbitrarily choosing to add the right subtree as a descendent of the left (assuming the left subtree is not null, or else we simply return the right subtree).The
contains
method is effectively the same as it would be in a mutable binary tree:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@Override | |
public boolean contains(T element) { | |
if ( value.equals( element) ) { | |
return true; | |
} | |
else if ( value.compareTo(element) < 0 ) { | |
return right == null ? false : right.contains(element); | |
} | |
else { | |
return left == null ? false : left.contains(element); | |
} | |
} |
The
toList
method does an in-order traversal of the tree to return an ordered list:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@Override | |
public List<T> toList() { | |
final List<T> asList = new LinkedList<T>(); | |
if ( left != null ) { asList.addAll(left.toList()); } | |
asList.add(value); | |
if ( right != null ) { asList.addAll(right.toList()); } | |
return asList; | |
} |
For debugging purposes, and to visualize the tree depth, I decided to override
toString
:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@Override | |
public String toString() { | |
return innerToString(""); | |
} | |
private String innerToString( String prefix ) { | |
StringBuilder builder = new StringBuilder(); | |
if ( left != null ) { | |
builder.append( left.innerToString(prefix + " ")); | |
} | |
builder.append(prefix).append(value).append("\n"); | |
if ( right != null ) { | |
builder.append( right.innerToString(prefix + " ")); | |
} | |
return builder.toString(); | |
} |
Finally, here is a runner I used to compare the correctness of my implementation for addition/removal against the standard mutable
TreeSet
from the Java standard library:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import java.util.Iterator; | |
import java.util.List; | |
import java.util.Random; | |
import java.util.Set; | |
import java.util.TreeSet; | |
public final class ImmutableTreeRunner { | |
private static final Random RANDOM = new Random(); | |
/** | |
* @param args | |
*/ | |
public static void main(String[] args) { | |
final Integer firstValue = Integer.valueOf( RANDOM.nextInt(1000)); | |
ImmutableSet<Integer> immutableSet = new ImmutableBinaryTree<Integer>(null, null, firstValue); | |
final Set<Integer> mutableSet = new TreeSet<Integer>(); | |
mutableSet.add(firstValue); | |
// Add 100 random integers in the [0,1000) range | |
for ( int i = 0; i < 100; i++ ) | |
{ | |
final int value = RANDOM.nextInt(1000); | |
mutableSet.add(value); | |
immutableSet = immutableSet.add(value); | |
} | |
List<Integer> immutableValues = immutableSet.toList(); | |
System.out.println( immutableValues.toString() ); | |
System.out.println( mutableSet.toString() ); | |
// Remove 100 random integers in the [0,1000) range | |
for ( int i = 0; i < 100; i++ ) | |
{ | |
final int value = RANDOM.nextInt(1000); | |
mutableSet.remove( value ); | |
immutableSet = immutableSet.remove(value); | |
} | |
immutableValues = immutableSet.toList(); | |
assert( immutableValues.size() == mutableSet.size() ); | |
Iterator<Integer> immutableValueIter = immutableValues.iterator(); | |
Iterator<Integer> mutableValueIter = mutableSet.iterator(); | |
System.out.println( immutableValues.toString() ); | |
System.out.println( mutableSet.toString() ); | |
while ( true ) { | |
assert( immutableValueIter.hasNext() == mutableValueIter.hasNext() ); | |
if ( immutableValueIter.hasNext() != mutableValueIter.hasNext() ) throw new RuntimeException("hasNext doesn't match"); | |
if ( !immutableValueIter.hasNext() ) { break; } | |
final Integer immutableVal = immutableValueIter.next(); | |
final Integer mutableVal = mutableValueIter.next(); | |
assert( immutableVal.equals(mutableVal)); | |
} | |
System.out.println(immutableSet.toString()); | |
} | |
} |
There are several problems with this implementation that I plan to address in a future post:
- The remove method creates a new node at each level, even if the element to remove does not exist.
- The use of
null
to indicate the lack of a left or right child opens us up to the potential ofNullPointerException
s and wastes space from those fields. (In particular, in a balanced binary tree, half the elements will be leaf nodes.) - Currently, there is no way to create an empty tree, since every node has a value. If you look at the
ImmutableTreeRunner
implementation above, I had to generate my first random value before I could create the ImmutableBinaryTree.
Hey there. I've been playing with immutable trees as well, and there's a decent solution to the Empty Tree problem if you model your tree as a few related classes:
ReplyDelete1: An Abstract Tree (or interface)
2: A Branch (representing a node with a value)
3: A Leaf (representing a node without a value)
An empty tree is a Leaf. Adding a value to a Leaf creates a new Branch with that value, and two new Leafs for Left/Right.
Adding to a branch always delegates to the Add method of the Left or Right child (which will eventually be a Leaf, which will do the actual creation of a new Branch node).
Hope that's somewhat useful. (Also, I appear to be raising the dead with this comment. Oops)
Hi Kevin,
ReplyDeleteThe branch/empty approach is far better than the use of nulls I described in this post (and it's consistent with e.g. the binary tree type described in Real World Haskell (find in page for "binary tree")), and is more practical than the approach I took in my second post on immutable binary trees.
That said, I still kind of like that approach (with concrete classes for DualBranch, LeftBranch, RightBranch, Leaf, and Empty), since every member variable is essential (that is, you never have a null or empty member), and the lack of one or both children is given by the class reference (which is mandatory in Java). A tree consisting purely of left branches in that approach ends up using as much memory as a linked list, I believe.
If I were implementing this for the real world, I think I would take the approach you describe, though.
Thanks for reading!