Recently I have had the benefit of working with a really good tester on an agile project. This person has consistently challenged both my terminology and approach to testing. After working with him I have come to a completely new conceptual model of automated testing (checking) vs manual (and exploratory) testing, how they are very different and how they complement each other. This is the story of what I have learnt.
A Developer Model of Testing
I've been writing automated tests as part of my development process for as long as I can remember. These used to be tests written after the code to check that it worked, then I started doing TDD and using the tests to drive the design while validating that it also worked as expected. Recently I've also added BDD into the mix as a way of driving and testing requirements. However, all this time my mental model of testing has been somewhat fixed and, as I now realise, focused solely from a development perspective. Let me try to describe this view.
As a developer, my goal is simple: write some code such that:
- given a known starting state;
- applying a known sequence of operations;
- with a defined set of input values;
- then I get an expected and valid output state.
My typical approach would be to write tests that:
- Set up some fixture code representing a state that I want to start from
- Define a test that applies an operation with valid input values
- Define some assertions that should be true about the state after the operation has completed
- Implement and refactor the code until the assertions pass
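To make that flow concrete, here is a minimal sketch of such a test in Python, using a hypothetical `BankAccount` class (the names and behaviour are invented purely for illustration, not taken from any real project):

```python
# A hypothetical class under test, shown only to illustrate the flow above.
class BankAccount:
    def __init__(self, balance=0):
        self.balance = balance

    def withdraw(self, amount):
        if amount <= 0 or amount > self.balance:
            raise ValueError("invalid withdrawal amount")
        self.balance -= amount
        return self.balance


def test_withdraw_reduces_balance():
    account = BankAccount(balance=100)  # fixture: known starting state
    account.withdraw(30)                # operation with valid input values
    assert account.balance == 70        # assertion about the output state
```

The implementation is then written and refactored until that assertion holds.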
I'd then go on to create some additional variations of the above test with alternative applications of the operation, adjusting the implementation code so that each test's assertions pass. Typically the alternative tests would cover a mix of:
- Apply an operation with different valid input values
- Apply an operation with invalid input values
- Apply an operation with boundary input values
- Apply an operation with non-supplied input values
- Apply an operation with potentially harmful input values
- Apply an operation forcing different failure cases in underlying code, subsystems or boundaries
- Apply operations involving concurrency or synchronisation
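Sketched in plain Python against the same kind of hypothetical `BankAccount` (the class and its validation rules are invented for illustration), several of those variations might look like this:

```python
# Hypothetical class under test; the validation rules are illustrative only.
class BankAccount:
    def __init__(self, balance=0):
        self.balance = balance

    def withdraw(self, amount):
        if not isinstance(amount, (int, float)):
            raise TypeError("amount must be numeric")
        if amount <= 0 or amount > self.balance:
            raise ValueError("invalid withdrawal amount")
        self.balance -= amount
        return self.balance


# Different valid and boundary input values.
for amount, expected in [(1, 99), (50, 50), (100, 0)]:
    assert BankAccount(100).withdraw(amount) == expected

# Invalid, non-supplied and potentially harmful input values.
for bad in [0, -5, 101, None, "100; DROP TABLE accounts"]:
    try:
        BankAccount(100).withdraw(bad)
        raise AssertionError("expected the input to be rejected")
    except (ValueError, TypeError):
        pass  # the operation rejected the input cleanly
```

A test framework's parameterisation support (such as pytest's `parametrize`) would normally express these as separate named cases rather than loops.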
As a conscientious developer I'd be trying to write automated tests that cover the widest range of scenarios so that I could be confident my code would work as expected in all situations. When describing my progress in a stand-up I'd be using phrases such as:
- "The code is done and is working because all the tests pass"
- "The tests show we've fixed all the defects"
- "We've made the change and the tests show that we haven't broken anything"
From a developer point of view the functionality would then be handed over to a QA team member to check, and perhaps for them to write some higher-level tests to ensure the code meets the functional requirement (i.e. given a good set of data and a good set of inputs, the correct result is observed). Code complete, story done, into production, next please!
The Awkward Tester
The above was pretty much my approach to testing and is, I think, pretty typical of that taken by most developers. However, the excellent tester on our team started to question the statements that we made about our code. For example:
- The code is done and working because all the tests pass: "Your tests show that your code makes the tests pass, but they don't prove that it works"
- The tests show we've fixed all the defects: "The tests actually just show that you've fixed the specific identified defect case(s), all the other unknown ones are still there"
- We've made the change and the tests show that we haven't broken anything: "No, the tests show that you haven't broken the paths covered by the tests, there's probably other things that are broken though"
My initial reaction was to just laugh this off, put the tester down as an awkward, pedantic sod and carry on as before. However, based on his well deserved reputation and previous work record, I trusted and respected the skills of this particular person. This sowed some seeds of doubt in my mind: was it just the language we developers were using that he had problems with, or was he really talking about some fundamental wider view of testing that we, as developers, were all missing?
My New Developer Model of Testing
After some serious contemplation, introspection (and a number of very long lunchtime walks) I have come to realise that what I have been considering as 'comprehensive' automated testing is really just a small part of a much wider testing world. By focusing so much on these automated test suites I had become blind to the wider implications of my code quality. Our excellent tester friend calls what we do automated 'checks', not tests, and I think he might be right! What we are building are pieces of non-deployed code that check that our real code does what we are expecting it to do in a certain set of scenarios. Testing is so much more. Let me explain my current thinking...
In automated checks we first set up fixtures that represent known starting states for our application, component or class. For all but the most trivial examples these states will only be a small subset of all of the possible states that the application could be in. The checks then pick an operation or set of operations (i.e. a feature) to evaluate against this state. Again, for anything but the most trivial single-function cases, the operations we pick and the order they are called in will represent just a small subset of all possible operations and combinations of those operations.
Next we look at input values and their combinations and the same subset rules apply. This is also true for failure scenarios that we might check. Finally, we assert that our code leaves the application in a consistent state, with those states again being just a small subset of the possible states my application could be in. We then clear down our test fixtures and move onto the next test. This can be clearly shown in the following diagram:
Now, our tester has a different view. He is trying to identify possible places where my application might break or exhibit a defect. He is therefore using his skills, knowledge and experience to pick combinations of specific states, operations, input values and failure scenarios that are both likely to occur and most likely to find the defects in the application. These may be inside the subsets used by the automated checks, but can also be outside, from the wider set of possibilities.
Additionally, the tester has another weapon in their arsenal in that they can use the output state of one set of tests as the input state into the next - something that it's not so easy to do in automated tests that might be run in a different order or individually. We can see the tester's approach to testing in the following diagram:
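The contrast can be sketched as follows, using a hypothetical `ShoppingCart` class (invented purely for illustration): a check starts from a freshly built fixture every time, while the tester deliberately carries the state left by one step forward into the next.

```python
# A hypothetical class under test, shown only to illustrate the contrast.
class ShoppingCart:
    def __init__(self):
        self.items = []

    def add(self, name, price):
        self.items.append((name, price))

    def remove(self, name):
        self.items = [i for i in self.items if i[0] != name]

    def checkout(self):
        if not self.items:
            raise ValueError("cannot check out an empty cart")
        return sum(price for _, price in self.items)


# Checking style: each case begins from a freshly built fixture.
def check_checkout_totals_items():
    cart = ShoppingCart()         # fixture set up from scratch
    cart.add("book", 10)
    assert cart.checkout() == 10  # assert, then the fixture is discarded


# Testing style: the output state of one step is the input to the next,
# walking a sequence the isolated check above never exercises.
cart = ShoppingCart()
cart.add("book", 10)
cart.remove("book")               # state carried over from earlier steps
try:
    cart.checkout()               # add-then-remove-then-checkout path
except ValueError:
    pass                          # here, at least, the failure is clean
```

The chained sequence probes a path that no single fresh-fixture check covers, which is exactly the kind of gap a skilled tester goes looking for.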
It is clear to me now that the two approaches are both necessary and complementary. Checking has far less value if testing is not also applied. Testing would also be far more expansive and time consuming if checking was not in place to find and eliminate a certain percentage of defects at development time.
The Role of the Tester
So, what is the role of the tester? Most developers would tend to assume that the tester's job is to prove that their code works (and I will even admit that this was probably the way I used to think). Go on, admit it developers: how many times have we cursed the tester under our breath when they have found a critical flaw in our code a few days/hours before it's due to be released? Come on, we know we've all done it!
Turns out we were wrong! The role of a tester is to prove that our code doesn't work, and to prove it before the code gets into production and is broken by a real user or customer. Why should a tester be trying to prove that my code works? Isn't that why I've invested so much time and effort into building my automated suite of checks - to give me confidence that my code works in the situations that I have identified? What I really want is to find out where my code doesn't work, so that I can identify the omissions from my suite of checks and fix the broken code paths. This is what a good tester should be doing for me.
First and foremost a good tester should be using his skills to find scenarios whereby my code fails and leaves the application in a broken and inconsistent state. This should be the Holy Grail for a tester and they should be rewarded for finding these problems. Who wants a production outage caused by a defect in their code? Not me, that's for sure. Next they should be finding cases where the application fails non-cleanly but is not broken and can continue to function. If they can do either of these things then they are worth their weight in salary/day rate! A tester that can only find the places where the code fails cleanly or does not fail at all is really not of much value: my automated checks already proved this! I know I will certainly be valuing my testers much more than I used to. And you should too.
So how, as developers, do we aid our able testers in finding the defects that are in our code? We have to help them understand how the bits of the system fit together; we have to explain the approaches we have used to determine our automated checking scenarios; we have to help them understand where the failure points might be; and we have to make sure they know the expected usage patterns that we have coded to. Given all this information our excellent tester should be able to find the gaps, the failure scenarios and the unexpected usages that expose the defects that our automated checks do not. When they find them we should be glad, and keen to find out how they managed to get our lovingly created code to break. We then fix the problem, add checks for those paths that we missed and learn the lessons so that we don't repeat the same defect-creating pattern in any future code we write.
The Stand-up Revisited
So, given this new understanding of testing, testers and automated checks, what should the progress descriptions in the stand-up really be:
- "The code implementation is complete and my checks for the main scenarios all pass. Mr Tester, can you find the defects please?"
- "I've fixed the specific path(s) which I identified as being a cause of the reported defect and added an automated check. We need to test to find the other paths that might be causing the same problem."
- "The existing checks all run against the new code. We need to test to see which other paths or aspects of features have been broken by these changes."
See the difference? No assumptions that just because an automated suite of checks has completed successfully the code either works, is free of defects or has not broken the application somewhere. The automated checks can reduce the probability of bugs or issues escaping the development process and can give me more confidence in my code. However, some defects and problems are still likely to be present and my application will still break under certain conditions that my checks don't cover. I now say: "...over to you, Mr. Tester. I'll explain what I've done and you can help me find where it is broken!"