Section 9.4 More tests
Subsection 9.4.1 Test 3 insert left
We’re going to start picking up the pace now, I will provide less details on my thought process. I hope the above two tests provided enough context. So next up I want to write a test that would place an element left of the root. In order to do that I will need to start from a tree that has a root in place, and I can use my
treeWith
helper method for that.But before I start, taking another look at this method, I don’t see any reason why this isn’t a method provided by the
BSTree
class. Therefore I will move it there. First I make it static, then I move it to the BSTree
class, then rename it to with
, as it will be called as BSTree.with(...)
so the tree
part is superfluous. Next, I parametrize it on a generic parameter type T
instead of Integer
.Looking at the method more, I think the
for
loop part should be an instance method of the tree
class, perhaps called insertAll
, so I will extract it as such: First I extract it as a static method with parameters tree
and elements
, then I turn it into an instance method of the tree
parameter, and end up with these two methods:static <T> BSTree<T> with(T... elements) {
BSTree<T> tree = empty();
tree.insertAll(elements);
return tree;
}
public void insertAll(T[] elements) {
for (T element : elements) insert(element);
}
Ok now I am ready to write my next test. I’ll do some wishful thinking again, writing what I would like to see:
@Test
public void insertAtLeftOfRoot() {
BSTree<Integer> tree = BSTree.with(5);
tree.insert(2);
assertEquals(
node(5,
node(2),
empty()),
tree.getRoot());
}
Note the
node(...)
part. I thought it would be a nice way to write a node with a value followed by the left and right children. But my current node
helper doesn’t take left
and right
parameters. And come to think of it, my BSTreeNode
class doesn’t have left
and right
fields either! Hm, so I better write a test that will force me to do that. This test doesn’t look very pretty, but it directly checks that the resulting tree has the correct structure. We’ll do better later.Write test code that will force you to write the production code you want.
public void insertAtLeftOfRoot() {
BSTree<Integer> tree = BSTree.with(5);
tree.insert(2);
assertEquals((Integer) 5, tree.getRoot().value);
assertEquals((Integer) 2, tree.getRoot().left.value);
assertEquals(empty(), tree.getRoot().right);
}
So this will force me to create
left
and right
fields of a node. We’ll decide later if we should expose that that way. Maybe that test will change later. But for now this will force me to add to BSTreeNode
the two fields:BSTreeNode<T> left;
BSTreeNode<T> right;
Now my test runs but fails because the current code actually puts every new value at the root. So now I am forced to do something there. I will add a simple
null
check:void insert(T element) {
if (root == null) root = BSTreeNode.node(element);
else root.left = BSTreeNode.node(element);
}
I will later need to add some checks in that
if
clause and also deal with the element having to go even deeper, but for now this will work and my tests are not forcing me to be smarter.Good, I am passing my test. It’s time to do some cleanup. I really would like to be able to write the test the way I wanted to originally, let’s see if I can make that happen now. I will create a
node
method in BSTreeNode
that takes three parameters, value-left-right. Then I will update my constructor to take left
and right
. So at the end of the day we’ll have the following:static <T> BSTreeNode<T> empty() { return null; }
static <T> BSTreeNode<T> node(T value) {
return new BSTreeNode<>(value, null, null);
}
static <T> BSTreeNode<T> node(T value, BSTreeNode<T> left, BSTreeNode<T> right) {
return new BSTreeNode<>(value, left, right);
}
private BSTreeNode(T value, BSTreeNode<T> left, BSTreeNode<T> right) {
this.value = value;
this.left = left;
this.right = right;
}
I need to also update my
equals
and hashCode
methods to use the new values:public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
BSTreeNode<?> that = (BSTreeNode<?>) o;
return Objects.equals(value, that.value) &&
Objects.equals(left, that.left) &&
Objects.equals(right, that.right);
}
public int hashCode() {
return Objects.hash(value, left, right);
}
I should ideally have some tests for this
equals
method, and now that we have the node
helper method it would be easy to write some tests. I’ll do that real quick:@Test
public void equalsForNodesWorks() {
assertEquals(node(5), node(5));
assertNotEquals(node(5), node(3));
assertEquals(node(5, node(3), node(7)),
node(5, node(3), node(7)));
assertNotEquals(node(5, node(2), node(7)),
node(5, node(3), node(7)));
assertNotEquals(node(5, node(3), node(8)),
node(5, node(3), node(7)));
}
That gives me a bit more confidence in my equals method. Now I can write my test in the more standard way.
@Test
public void insertAtLeftOfRoot() {
BSTree<Integer> tree = BSTree.with(5);
tree.insert(2);
assertEquals(node(5, node(2), empty()),
tree.getRoot());
}
I want to do a bit of cleanup in my tests. I notice that I always have to get the root of the tree then compare it to a node. It sounds like that could be a method. To make that happen I will take the following steps on the above test:
- Extract the
node(...)
part into a local variable calledexpected
. -
Perform Extract Method on the
assertEquals
line, withexpected
andtree
as the parameters, to anassertRootEquals
method.private <T> void assertRootEquals(BSTree<T> tree, BSTreeNode<T> expected) { assertEquals(expected, tree.getRoot()); }
- Inline the
expected
variable now that it did its job (to force the Extract Method mechanism to recognize it as a variable). - Coincidentally, I can now also simplify the
assertInsertingOneMakesRootNode
method, as well as theemptyTreeHasNullNode
test assertion.
This is how my test looks like:
@Test
public void insertAtLeftOfRoot() {
BSTree<Integer> tree = BSTree.with(5);
tree.insert(2);
assertRootEquals(tree, node(5, node(2), empty()));
}
Turning my attention to my
BSTree.insert
code, I notice the repetition of BSTreeNode.node(element)
in both branches. I recognize that I will end up repeating that a lot: no matter what, this node will be created. I will therefore extract it as a local variable before I do any of the remaining work of the function, to a variable called just node
. Then I think I will extract the rest of that method into another method, private for now, called insertNode
:void insert(T element) {
BSTreeNode<T> node = BSTreeNode.node(element);
insertNode(node);
}
private void insertNode(BSTreeNode<T> node) {
if (root == null) root = node;
else root.left = node;
}
Next I will inline the
node
variable in the insert
method, since it now has only one occurrence. And now we can focus on working with this second method, insertNode
. And we are ready for the next test.Subsection 9.4.2 Test 4 insert right
This test is easy to write, I’ll just more or less copy my left test:
@Test
public void insertAtRightOfRoot() {
BSTree<Integer> tree = BSTree.with(5);
tree.insert(7);
assertRootEquals(tree, node(5, empty(), node(7)));
}
To my delight the test is failing. But I notice that it doesn’t show me a useful "difference", just telling me that two
BSTreeNode
objects with weird ids are different. What I need is to have the system "print" me some version of the node, even if it is all in one line. We can do that by overwriting the toString
method.Provide atoString
method that will show you useful diagnostics in a test case.
I will use the following, taking special care to handle the "both children null" case with a simpler printout:
public String toString() {
if (left == empty() && right == empty())
return String.format("leaf(%s)", value);
return String.format("node(%s, %s, %s)", value, left, right);
}
Nodes with no children are called leaves. Now this reminds me, that I still have a
node(value)
static method that makes a leaf. I will rename it to be called leaf
and distinguish it from the one that takes three arguments.Ok and now we need to make our test pass. It fails since I always add to the left right now. I need to differentiate.Here’s what I want to write:
private void insertNode(BSTreeNode<T> node) {
if (root == null) root = node;
else if (node.value < root.value) root.left = node;
else root.right = node;
}
But the compiler won’t let me do that. The reason is that the compiler does not know what "less than" means for objects of some arbitrary class
T
. And that makes sense: If we are going to insert elements in a binary search tree, we need to be able to compare them.Luckily for us Java provides an type that does that. We can specify that our variable
T
must be of that type. We do this changing out class definition to:public class BSTree<T extends Comparable<T>> {
We’ll talk about the
Comparable
type in a moment. For now, note that the compiler will ask us to fix the various static methods as well, specifying the same thing for their type in the front.static <S extends Comparable<S>> BSTree<S> empty() {
return new BSTree<>();
}
Now we come to [Comparable](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/Comparable.html). The gist of
Comparable
is a method compareTo
which returns an integer comparison value. If o1.compareTo(o2)
is $0$ then the two elements are considered equal, if it is negative then o1
is considered less than o2
and if it is positive then o1
is considered bigger than o2
.We can therefore use this method in our check, as follows:
private void insertNode(BSTreeNode<T> node) {
if (root == null) root = node;
else if (node.value.compareTo(root.value) < 0) root.left = node;
else root.right = node;
}
And now our tests should be passing! Time for cleanup. That comparison is really hideous. I wish that the
Comparable
definition included some helper methods like isLessThan
defined in terms of compareTo
, but it does not. So I will extract a method of our own. It won’t look as nice as it could have been but it will get the job done.private void insertNode(BSTreeNode<T> node) {
if (root == null) root = node;
else if (isLessThan(node.value, root.value)) root.left = node;
else root.right = node;
}
private boolean isLessThan(T v1, T v2) {
return v1.compareTo(v2) < 0;
}
I don’t see anything else to do, so I will move to the next test.
Subsection 9.4.3 Test 5 insert left-left
Now we need to consider going deeper into the tree. Let’s write a test for inserting a left child of the left child of the root:
@Test
public void insertAtLeftLeftOfRoot() {
BSTree<Integer> tree = BSTree.with(5);
tree.insert(2);
tree.insert(1);
assertRootEquals(tree,
node(5,
node(2, leaf(1), empty()),
empty()));
}
This test will fail, as now the insert operation keepsreplacing the left child of the root. But it’s a start, we have a failing test. Now to make it pass. It is clear that I need to do more work with the "less than" case, so I will start by extracting it into a helper method:
private void insertNode(BSTreeNode<T> node) {
if (root == null) root = node;
else if (isLessThan(node.value, root.value)) insertLeft(root, node);
else root.right = node;
}
private void insertLeft(BSTreeNode<T> parent, BSTreeNode<T> node) {
parent.left = node;
}
So now I can focus on fixing this
insertLeft
method. Sounds like I need to do the exact same kind of thing as I did for insertNode
to begin with: If the left is null
we do one thing, otherwise we do another:private void insertLeft(BSTreeNode<T> parent, BSTreeNode<T> node) {
if (parent.left == null) parent.left = node;
else parent.left.left = node;
}
Good, this makes our tests pass! Hm but what if I had to go even further left to get to the right spot? And what about the similarities between my two code blocks? It sort of feels likeI’m doing the same sorts of checks that I did for the
root
node, and it will be very similar when I look at insertRight
, and that feels like needless repetition that we should be able to avoid. We’ll need to put our thinking caps on. But let’s push through and get these basic sets of code going, then we’ll look at the code duplication.Subsection 9.4.4 Test 6 insert left-right, right-left, right-right
I’ll attack two features together as they seem related. We’ll start with inserting left-right:
@Test
public void insertAtLeftRightOfRoot() {
BSTree<Integer> tree = BSTree.with(5);
tree.insert(2);
tree.insert(3);
assertRootEquals(tree,
node(5,
node(2, empty(), leaf(3)),
empty()));
}
Ok it’s failing, because our
insertLeft
method does not differentiate between left and right, and always puts in left. We need to change it so that it checks:private void insertLeft(BSTreeNode<T> parent, BSTreeNode<T> node) {
if (parent.left == null) parent.left = node;
else if (isLessThan(node.value, parent.left.value)) parent.left.left = node;
else parent.left.right = node;
}
Now let’s set the same up for the right side. But before I do so, I find my tests a bit repetitive. I can change them by using the fact that
with
takes multiple elements and inserts them in that order. I should have probably done this earlier:@Test
public void insertAtLeftRightOfRoot() {
assertRootEquals(BSTree.with(5, 2, 3),
node(5,
node(2, empty(), leaf(3)),
empty()));
}
And similarly for the other tests. This way I could literally put many of these asserts in the same test if that were to make sense.
Ok now time for right-right and right-left. I’ll write them both at once:
@Test
public void insertAtRightRightOrRightLeft() {
assertRootEquals(BSTree.with(5, 7, 8),
node(5,
empty(),
node(7, empty(), leaf(8))));
assertRootEquals(BSTree.with(5, 7, 6),
node(5,
empty(),
node(7, leaf(6), empty())));
}
And for the code I basically copy over the
insertLeft
method, a sure sign that I should be able to do some nice refactoring, after all inserting on the left and inserting on the right are fundamentally the same thing once you are there.private void insertNode(BSTreeNode<T> node) {
if (root == null) root = node;
else if (isLessThan(node.value, root.value)) insertLeft(root, node);
else insertRight(root, node); // <--- new stuff
}
private void insertRight(BSTreeNode<T> parent, BSTreeNode<T> node) {
if (parent.right == null) parent.right = node;
else if (isLessThan(node.value, parent.right.value))
parent.right.left = node;
else parent.right.right = node;
}
This passes all my tests, and is a good sign. But the code duplication is ridiculous, and it also won’t go down to a 4th level. Let’s get that in first, the ability to work with any level.
Subsection 9.4.5 Test 7 insert left-left-left etc
@Test
public void insertMoreLevels() {
assertRootEquals(BSTree.with(5, 4, 3, 2),
node(5,
node(4,
node(3,
leaf(2),
empty()),
empty()),
empty()));
assertRootEquals(BSTree.with(5, 6, 7, 8),
node(5,
empty(),
node(6,
empty(),
node(7,
empty(),
leaf(8)))));
}
I hope by now you are getting comfortable reading these node-chains, if not you should work it out until it becomes easy to digest.
Alright, we have failing tests, now we need to figure out how to fix them. Looking at my code, it sounds like when I set the value of
parent.left.left
for example, it would only be the wrong thing to do if that value was null
before. And if I add a check there, well it sounds like I am redoing the work that insertLeft
is already doing! Doh, that’s the answer, recursion! we’ll just call ourselves, telling ourselves to insert the node somewhere to the left of parent.left
. Let’s see if that works:private void insertLeft(BSTreeNode<T> parent, BSTreeNode<T> node) {
if (parent.left == null) parent.left = node;
else if (isLessThan(node.value, parent.left.value))
insertLeft(parent.left, node);
else insertRight(parent.left, node);
}
private void insertRight(BSTreeNode<T> parent, BSTreeNode<T> node) {
if (parent.right == null) parent.right = node;
else if (isLessThan(node.value, parent.right.value))
insertLeft(parent.right, node);
else insertRight(parent.right, node);
}
So we replaced each
xx.left = node
with insertLeft(xx, node)
and similarly for any xx.right = node
. And this passes our test, and gets us closer to our answer. Still some cleanup left to do.Looking at our two functions, it seems the code from the first
else
and on is identical except for using parent.left
or parent.right
. Let’s extract that into a method, hm, let’s call it insertUnderParent
because it tries to insert the node somewhere below a parent:private void insertUnderParent(BSTreeNode<T> parent, BSTreeNode<T> node) {
if (isLessThan(node.value, parent.value)) insertLeft(parent, node);
else insertRight(parent, node);
}
We can now reuse this method in our three other
insert
methods:private void insertNode(BSTreeNode<T> node) {
if (root == null) root = node;
else insertUnderParent(root, node);
}
private void insertLeft(BSTreeNode<T> parent, BSTreeNode<T> node) {
if (parent.left == null) parent.left = node;
else insertUnderParent(parent.left, node);
}
private void insertRight(BSTreeNode<T> parent, BSTreeNode<T> node) {
if (parent.right == null) parent.right = node;
else insertUnderParent(parent.right, node);
}
Hm much nicer, but boy all these still look very similar. What is going on? Can we uniformize even more? Let’s take a look at what they each say:
- If the root is null, then replace it with the node, else insert the node under the root
- If the parent left is null then replace it with the node, else insert the node under the parent left
- If the parent right is null then replace it with the node, else insert the node under the parent right
It sure would be nice if we could say this only once. But we can’t right now, basically because we do an assignment in the one branch but not in the other. Imagine if our blocks looked like this:
if (xx == null) xx = node
else xx = ...
If that was the case, we could turn it into a single assignment:
xx = "if xx == null then node else ..."
Let’s see if we can pull that off. If I am in the
else
case, then I don’t want to change the thing. So what if my insertUnderParent
method actually returned the parent?private BSTreeNode<T> insertUnderParent(BSTreeNode<T> parent, BSTreeNode<T> node) {
if (isLessThan(node.value, parent.value)) insertLeft(parent, node);
else insertRight(parent, node);
return parent;
}
Then my other methods could become:
private void insertNode(BSTreeNode<T> node) {
if (root == null) root = node;
else root = insertUnderParent(root, node);
}
And similarly for the others. I run my tests, and they still pass! Cool! Now to get the assignment out of the conditional. Recall the ternary operator:
x = a ? b : c;
It assigns to
x
the value b
if a
evaluates to true
, and the value c
if a
evaluates to false
. So I could write my insertNode
method instead as:private void insertNode(BSTreeNode<T> node) {
root = (root == null) ? node : insertUnderParent(root, node);
}
I do that in all three. My tests still pass! Now I can extract the right-hand-side of the assignment as a method, called
insertAtParent
for now as I am running out of names. We’re about to simplify things though. This is how my methods look like now:private void insertNode(BSTreeNode<T> node) {
root = insertWithParent(root, node);
}
private void insertLeft(BSTreeNode<T> parent, BSTreeNode<T> node) {
parent.left = insertWithParent(parent.left, node);
}
private void insertRight(BSTreeNode<T> parent, BSTreeNode<T> node) {
parent.right = insertWithParent(parent.right, node);
}
private BSTreeNode<T> insertWithParent(BSTreeNode<T> parent, BSTreeNode<T> node) {
return parent == null ? node : insertUnderParent(parent, node);
}
private BSTreeNode<T> insertUnderParent(BSTreeNode<T> parent, BSTreeNode<T> node) {
if (isLessThan(node.value, parent.value)) insertLeft(parent, node);
else insertRight(parent, node);
return parent;
}
I think at this point the
insertLeft
and insertRight
methods are so simple, and only used in one place, that I will just inline them. And I will also inline insertNode
, which also does very little right now. And I end up with:void insert(T element) {
root = insertWithParent(root, BSTreeNode.leaf(element));
}
private BSTreeNode<T> insertWithParent(BSTreeNode<T> parent, BSTreeNode<T> node) {
return parent == null ? node : insertUnderParent(parent, node);
}
private BSTreeNode<T> insertUnderParent(BSTreeNode<T> parent, BSTreeNode<T> node) {
if (isLessThan(node.value, parent.value))
parent.left = insertWithParent(parent.left, node);
else parent.right = insertWithParent(parent.right, node);
return parent;
}
I feel there’s still one too many methods here. And the two methods are not clearly differentiated. I will try to inline the
insertUnderParent
method and see if I like the result. First, I change the ternary back into an if-then-else
block, now that it returns something rather than making an assignment.private BSTreeNode<T> insertWithParent(BSTreeNode<T> parent, BSTreeNode<T> node) {
if (parent == null) return node;
return insertUnderParent(parent, node);
}
Now I think inlining
insertUnderParent
will work well:private BSTreeNode<T> insertWithParent(BSTreeNode<T> parent, BSTreeNode<T> node) {
if (parent == null) return node;
if (isLessThan(node.value, parent.value))
parent.left = insertWithParent(parent.left, node);
else parent.right = insertWithParent(parent.right, node);
return parent;
}
Ok not too bad, not great yet but a start. The thing I am realizing now is that I can make this method into an instance method of the
BSTreeNode
class, it no longer needs anything from the tree class as the root
is nowhere to be found. So I will do the Convert to Instance Method refactoring, using the node
variable as the new this
.With an IDE’s automatic refactoring tools I do it as follows: Turn both
insertWithParent
and isLessThan
into static methods, then convert insertWithParent
into an instance method of the BSTreeNode
class.To do it manually instead, I would do the following:
- I copy the method over to the
BSTreeNode
class and make it package-private. I remove thenode
parameter. - I also move the
isLessThan
method there as static. In order to do that I need add<T extends Comparable<T>>
in almost every place inBSTreeNode
where the type is specified (static methods and the class declaration). - Then I make
isLessThan
private again.
Finally, I will rename my method to simply
insertUnder
. Here’s what I will have in the BSTreeNode
class:BSTreeNode<T> insertUnder(BSTreeNode<T> parent) {
if (parent == null) return this;
if (isLessThan(value, parent.value))
parent.left = insertUnder(parent.left);
else parent.right = insertUnder(parent.right);
return parent;
}
And back in
BSTree
our insert
method has become:void insert(T element) {
root = BSTreeNode.leaf(element).insertUnder(root);
}
So each node knows how to insert itself under a parent. If that parent is null then it replaces the parent. Otherwise it figures out if it should be inserted into the left or the right side, and goes in there to do that. Each method returns the node that is to be the new parent, which most of the times is just the old parent, except for the
null
case.I don’t like this very much, I would have preferred to make the parents the
this
, instead of the node that is to be inserted. It would feel more naturally to tell a node "insert this other node appropriately". But as long as we allow our nodes to be null
, we can’t do that.Read that again: The main problem is that we have chosen to represent the absence of a node with the
null
value. What we might want instead is some sort of object that represents this null value but actually has possibly useful methods. These are called "Null objects", and we will discuss this pattern later, after we have talked about interfaces. For now, it is time to end this chapter. You should write a test that builds a bigger tree than the ones we have created so far, and verify that it works.