So, you want to write software?: Selecting The Correct Test Scenarios and Data

Yesterday I happened to be reviewing some integration tests with my excellent tester colleague. These tests were defined using a set of highly simplified synthetic data selected for the purpose of making the tests pass. We both agreed that tests should actually be written using a mixture of realistic and synthetic data selected for the purpose of trying to make the tests fail! As developers we are often too keen to write tests that prove that our code is good rather than writing tests to find the problems in our code.

This exercise bought to mind a situation I had a few years ago with some code provided by an offshore development team. It was a very simple piece of code and came with some tests to show that it worked. However, it exhibited a single fundamental flaw that the synthetic data designed to make the tests pass would never have found. I've recalled the problem below (somewhat simplified), and I've update the code from Java/JUnit to Scala and Cucumber (because quite frankly I prefer these languages!).

The Requirement

The specific requirement for the system being built was that it would collect a user's date of birth during the sign-in process. If the user was under 17 then they would not be allowed to use the system. If they were 17 or over then they would be taken to the main menu. The particular piece of code in question was the age validation algorithm.

The Supplied Solution

The initial solution to this problem consisted of the following piece of code:

class AgeValidation {
  private val AuthorisedAge = 17

  def isAuthorisedAge(user: User) = {
    val cal = Calendar.getInstance
    val currentYear = cal.get(Calendar.YEAR);

    cal.setTime(user.dateOfBirth)
    val birthYear = cal.get(Calendar.YEAR);

    (currentYear - birthYear) >= AuthorisedAge
  }
}

Tests were supplied that covered two specific scenarios:

Feature: The application should ensure only users over 17 are allowed access
    As a Application Security
    I want only users aged over 17 to access the system

    Scenario: User aged over 17 can access the system
        Given a user aged over 17
        When the age check is applied
        Then they are authorised to continue

    Scenario: User aged under 17 cannot access the system
        Given a user aged under 17
        When the age check is applied
        Then they are not authorised to continue

And the implementation of these tests was as follows:

class AgeValidationSteps extends ScalaDsl with EN with ShouldMatchers {
  private var user: User = _
  private var valid: Boolean = _
  private val dateFormatter = new SimpleDateFormat("dd/MM/yyyy");

  Given("""^a user aged over 17$""") {
    user = new User("test", dateFormatter.parse("01/01/1970"))
  }

  Given("""^a user aged under 17$""") {
    user = new User("test", dateFormatter.parse("01/01/2010"))
  }

  When("""^the age check is applied$""") {
    val validator = new AgeValidation
    valid = validator.isAuthorisedAge(user)
  }

  Then("""^they are authorised to continue$""") {
    valid should be (true)
  }

  Then("""^they are not authorised to continue$""") {
    valid should be (false)
  }
}

Extra credit if you can spot the problem with this implementation and why the synthetic test data would have never found it before you read on!

The Problem With This Code

Did you spot it? Here's a hint: what if my 17th birthday is not until tomorrow? Yes, that's right, the code has a fundamental defect in that anyone who is 17 in the current year but who has not yet had their birthday is granted access to the system even though they are still 16. And yes, this was the actual algorithm submitted to us by the offshore developers!

Quite clearly, the synthetic test data used in the supplied tests was selected to ensure the tests passed, but not to find the problems in the code. Let's correct the feature spec to base it on values that actually might find something:

Feature: The application should ensure only users over 17 are allowed access
    As a Application Security
    I want only users aged over 17 to access the system

    Scenario: User who was 17 last year can access the system
        Given a user whose 17th birthday was last year
        When the age check is applied
        Then they are authorised to continue

    Scenario: User who is 17 next year cannot access the system
        Given a user whose 17th birthday is next year
        When the age check is applied
        Then they are not authorised to continue

    Scenario: User who was 17 yesterday can access the system
        Given a user whose 17th birthday was yesterday
        When the age check is applied
        Then they are authorised to continue

    Scenario: User who is 17 today can access the system
        Given a user whose 17th birthday is today
        When the age check is applied
        Then they are authorised to continue

    Scenario: User who is 17 tomorrow cannot access the system
        Given a user whose 17th birthday is tomorrow
        When the age check is applied
        Then they are not authorised to continue

And the associated test steps:

class AgeValidationSteps extends ScalaDsl with EN with ShouldMatchers {

  private var user: User = _
  private var valid: Boolean = _
  private val dateFormatter = new SimpleDateFormat("dd/MM/yyyy");

  Given("""^a user whose 17th birthday was last year$""") {
    val cal = Calendar.getInstance
    cal.add(Calendar.YEAR, -18)
    user = new User("test", cal.getTime)
  }

  Given("""^a user whose 17th birthday is next year$""") {
    val cal = Calendar.getInstance
    cal.add(Calendar.YEAR, -16)
    user = new User("test", cal.getTime)
  }

  Given("""^a user whose 17th birthday was yesterday$""") {
    val cal = Calendar.getInstance
    cal.add(Calendar.YEAR, -17)
    cal.add(Calendar.DATE, -1)
    user = new User("test", cal.getTime)
  }

  Given("""^a user whose 17th birthday is today$""") {
    val cal = Calendar.getInstance
    cal.add(Calendar.YEAR, -17)
    user = new User("test", cal.getTime)
  }

  Given("""^a user whose 17th birthday is tomorrow$""") {
    val cal = Calendar.getInstance
    cal.add(Calendar.YEAR, -17)
    cal.add(Calendar.DATE, 1)
    user = new User("test", cal.getTime)
  }

  When("""^the age check is applied$""") {
    val validator = new AgeValidation
    valid = validator.isAuthorisedAge(user)
  }

  Then("""^they are authorised to continue$""") {
    valid should be (true)
  }

  Then("""^they are not authorised to continue$""") {
    valid should be (false)
  }
}

Run this set of tests against the existing code and the final scenario 'User who is 17 tomorrow cannot access the system' fails. Our tests have actually done something useful: they have proved that our code doesn't work. Test data selected with the specific goal of trying to find problems with the code has done its job. I'll leave it as an exercise for the reader to fix the implementation so that it all works correctly!

Further Improvements

Even with our improved set of test data there are still some significant improvements that can be done to these tests by selecting some additional scenarios and test data values that might find problems:

Testing the algorithm works with data boundary cases such as first/last day of the year, leap years and so on
What happens when the user enters a date of birth in the future, does the code still work?
What about if the person is over 100? Are we dealing with 4 digit or 2 digit years?
Is this an international system? If so, then what about timezones? A person in a timezone east of us may already by 17, but may not be 17 in our timezone.

Apply The Same Rules To Our Tests

We can also apply the same thinking to our tests. For example, if we run our improved tests against the original code but do so on the first day of the year then they will all pass. The reason being that subtracting one day from the current date would push the birth date into the previous year. We therefore need to think not only about how our synthetic test data might find problems in our code by also how they may cause problems in our tests. In this case, solving this problem is just a case of improving our code to allow our tests to define the current date rather than using whatever the current system date is:

class AgeValidation(baseDate: () => Date) {
  ...
}
object AgeValidation {
  def apply = new AgeValidation(() => new Date)
  def apply(baseDate: String) = new AgeValidation(() => new SimpleDateFormat("dd/MM/yyyy").parse(baseDate))
}

// And in the tests
val validator = AgeValidation("15/07/2010")

Our test code now has the ability to control exactly what current date is used as a basis for the comparison, so it can experiment with a wider range of possible scenarios.

Does This Change Our TDD Approach

All of the above got me thinking about how this impacts on the Test-Driven Development approach. In this model we first write a test to validate the expected behaviour of our code. We then write minimal code so that the test fails. Next we write the code to make the test pass. Finally we refactor to make the code better without breaking the tests. Most developers will approach this by picking some synthetic values that are designed to prove that the 'happy path' through the code works when they have written it. Perhaps we should be focusing more on TDD where the tests we start out writing are for the scenarios that are more likely to identify problems - that way the code we write to make these pass has to deal with these cases. Certainly something I will be experimenting with in my next TDD session.

2 comments:

Anonymous10 February 2011 at 10:29
Hi Chris,

you've certainly hit a grey area here and there are few black and white answers.

One thing to bear in mind is that we are developers and not testers and vice versa.

What I mean by this is that I do not think at the TDD level you should be writing lots of tests that a tester would to try and break the code. Thats there job at a higher-level.

I believe TDD is not about testing our code as much as it is about designing our code. The tests act as regression and are a side benefit.

I use my tests to drive the design and spur me to write production code to pass the tests.

I would cover the most obvious boundary cases that I deem important. Pathological testing I would not do.
Unknown20 February 2011 at 15:07
This post made me smile, since I worked with you on this project and it brought back some painful memories! I think this code was all in a single 3000 line class too.

Thursday, 10 February 2011

Selecting The Correct Test Scenarios and Data