Section 11.5 Testing schools of thought
When it comes to using mocks in tests, there are fundamentally two schools of thought, one exemplified by using mocks and the other by avoiding them.
- The behavior-prescribing school of thought focuses on testing that a method behaved in the expected way. It does so by using spies and mocks to enforce that behavior.
- The behavior-agnostic school of thought focuses on the effect of the method call on the system, simply querying the state of the system after the call is completed. It cares about what the call accomplished, not how the method carried out its work.
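To make the contrast concrete, here is a minimal sketch of the two styles side by side. The summarize method is only a hypothetical stand-in for a class like Summary (its averaging logic and output format are assumptions made for this illustration), and SpyIterator is a hand-written spy rather than one produced by a mocking library.

```java
import java.util.Iterator;
import java.util.List;

class SchoolsOfThoughtDemo {
    // Hypothetical method under test (an assumption for this sketch):
    // consumes the grades once and reports their integer average.
    static String summarize(Iterator<Integer> grades) {
        int count = 0;
        int total = 0;
        while (grades.hasNext()) {
            total += grades.next();
            count++;
        }
        return count == 0 ? "No grades" : "Average: " + (total / count);
    }

    // Hand-written spy: delegates to a real iterator and records
    // how many times next() was called.
    static class SpyIterator implements Iterator<Integer> {
        private final Iterator<Integer> delegate;
        int nextCalls = 0;

        SpyIterator(Iterator<Integer> delegate) {
            this.delegate = delegate;
        }

        @Override
        public boolean hasNext() {
            return delegate.hasNext();
        }

        @Override
        public Integer next() {
            nextCalls++;
            return delegate.next();
        }
    }

    // Behavior-agnostic: only the resulting string is inspected.
    static boolean behaviorAgnosticCheck() {
        String result = summarize(List.of(80, 90, 100).iterator());
        return result.equals("Average: 90");
    }

    // Behavior-prescribing: the test insists that each grade is read exactly once.
    static boolean behaviorPrescribingCheck() {
        SpyIterator spy = new SpyIterator(List.of(80, 90, 100).iterator());
        summarize(spy);
        return spy.nextCalls == 3;
    }

    public static void main(String[] args) {
        System.out.println("behavior-agnostic check passed: " + behaviorAgnosticCheck());
        System.out.println("behavior-prescribing check passed: " + behaviorPrescribingCheck());
    }
}
```

The first check would keep passing however summarize chose to walk through the grades. The second would start failing as soon as the number of next() calls changed, even if the final string stayed the same.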
Our original tests for Summary are a good example of the behavior-agnostic school of thought. They gave the Summary class an appropriate list of grades via an iterator, did not care how it used that list, and simply inspected the final string.
Our newer tests moved more towards the behavior-prescribing school of thought. They spied on the iterator provided to the Summary class, and expected that iterator to be called in a specific way.
So which way is better? I am not sure there is a clear answer here; each approach has its merits. For example, our new tests for the Summary class would fail if that class chose to somehow go through the list twice (not easy to do currently, because we only gave it an iterator and that can only be used once, but think of it as a thought experiment). Should our test care if the Summary class chooses to do more work than it needs to? Do we make our tests more fragile by making them expect very specific things? Will they break when we try to change something in the Summary class?
On the other hand, remember why we chose to use mocks in the first place. They helped us decouple the Summary class from the GradeReader class, so that any test failure has to be due to a fault in the Summary class, as opposed to a fault in the GradeReader class which spread to the Summary class like wildfire because they had been built too close to each other.
This is then our tradeoff:
- Behavior-agnostic tests end up depending on the correct behavior of other modules and therefore make it harder to identify failure points. Our tests are more likely to break for reasons that they are not responsible for testing.
- Behavior-prescribing tests end up being closely tied to the specific implementation of the method, by prescribing steps that the function is required to take. Our tests are more likely to break during refactoring efforts or in any attempt to change how our function behaves.
I’d like to think that the answer is somewhere in the middle:
Use stubs to provide the needed inputs to the method under test, decoupling it from other classes. But use spies and mocks only to prescribe behavior that is an absolutely integral part of what this method must do.
Depending on their role in the overall system, certain classes and methods might need one of those approaches more than the other.
For example, if a method is supposed to print something to the standard output, then we should spy on that standard output to ensure that this is indeed what the method is doing. That is an essential behavior-prescribing part of the test. We don’t want the function to print something twice, for example. So what are our choices? We can capture the standard output of the program, redirect it to a string, and then check that it is the correct string. Or we can mock the process instead, and find out how the method called the standard output.
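The first choice, capturing the standard output into a string, could be sketched roughly as follows. The printGreeting method and the captureStdout helper are hypothetical names introduced for this illustration. The only library calls used are System.setOut and the standard PrintStream and ByteArrayOutputStream classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class CaptureStdoutDemo {
    // Hypothetical method under test: it is supposed to print one line.
    static void printGreeting() {
        System.out.println("Hello");
    }

    // Runs the action with System.out redirected to an in-memory buffer,
    // then restores the original stream and returns whatever was printed.
    static String captureStdout(Runnable action) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream capture = new PrintStream(buffer, true);
        System.setOut(capture);
        try {
            action.run();
        } finally {
            capture.flush();
            System.setOut(original);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String printed = captureStdout(CaptureStdoutDemo::printGreeting);
        String expected = "Hello" + System.lineSeparator();
        // Comparing the whole string also catches the "printed twice" mistake.
        System.out.println(printed.equals(expected) ? "output matches" : "unexpected output");
    }
}
```

Because the test compares the complete captured string, it still enforces that the line is printed exactly once, without saying anything about which printing method produced it.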
And with this example we arrive at one of the problems. Is our method going to use System.out.println, System.out.format, or any of another dozen methods used to write output? Do we really want to prescribe which of those it uses? Probably not. When there are multiple ways of accomplishing something, a behavior-prescribing approach can be limiting.
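The limitation can be seen in a small sketch. Both report methods below are hypothetical and, as a simplifying assumption, receive their PrintStream as a parameter instead of writing to System.out directly. SpyPrintStream is a hand-written spy that counts calls to println(String) while still recording the text that was written.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class PrescribingOutputDemo {
    // Two hypothetical implementations that produce identical output.
    static void reportWithPrintln(PrintStream out) {
        out.println("Total: 3");
    }

    static void reportWithFormat(PrintStream out) {
        out.format("Total: %d%n", 3);
    }

    // Hand-written spy: counts println(String) calls and keeps the written text.
    static class SpyPrintStream extends PrintStream {
        private final ByteArrayOutputStream buffer;
        int printlnCalls = 0;

        SpyPrintStream(ByteArrayOutputStream buffer) {
            super(buffer, true);
            this.buffer = buffer;
        }

        @Override
        public void println(String s) {
            printlnCalls++;
            super.println(s);
        }

        String written() {
            flush();
            return buffer.toString();
        }
    }

    public static void main(String[] args) {
        String expected = "Total: 3" + System.lineSeparator();

        SpyPrintStream first = new SpyPrintStream(new ByteArrayOutputStream());
        reportWithPrintln(first);
        SpyPrintStream second = new SpyPrintStream(new ByteArrayOutputStream());
        reportWithFormat(second);

        // A check on the written text accepts both implementations.
        System.out.println("same output: "
                + (first.written().equals(expected) && second.written().equals(expected)));
        // A check that prescribes println(String) accepts only the first one.
        System.out.println("println calls: " + first.printlnCalls + " vs " + second.printlnCalls);
    }
}
```

A test that insisted on exactly one println call would reject the format-based version even though its output is identical, which is the kind of over-prescription the paragraph above warns about.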