You may have been there before; you may be there now. You’ve taken a course or tutorial on testing, and you’re ready to test your code as every good programmer does. You open your editor, type in some code, then you get stuck.
“What should I test?”
I’ve been there before; I’ll be there again, probably more times than I’d like. The question, together with the feeling that I’m not good at this testing thing (which I suspect is true), is uncomfortable. But work must be done and tested. The trick to resolving the block lies in that question: we can turn it into a process that answers it.
If you find yourself confused about what the test cases should be, it may be a sign that the requirements are insufficient. Insufficient requirements are more wishes than requirements. They’ll make you scratch your head bald. Some slices of cucumber can help. We’ll come back to Cucumber later, and you’ll see why it’s good to eat your vegetables.
Insufficient requirements are more likely if your requirements are written as user stories. This doesn’t mean that user stories are bad. Their format expresses wishes from the users’ perspective, so they’re often declarative, outcome-oriented, and full of unstated assumptions. There, in those assumptions, is where the sweet stuff may be.
Unpack the Assumptions
Assumptions may hold insights into what the requirements are. These assumptions are often unstated in user stories, because the users, well, assume them: they assume that your system is secure, scalable, safe, responsive; they state what they want, and assume that you know how to make it work. The problems come when user stories are handed directly to programmers without refinement. User stories must be discussed and unpacked, their assumptions examined, and the trade-offs considered, for a suitable implementation to be possible.
Consider a fictitious user story for a fictitious video streaming platform called MeTube. MeTube hosts lots of videos: educational videos, workout videos that’ll make you feel bad for eating that pizza (yes, eat your vegetables), boring webinars with poor sound, and long, boring webinars with poor sound. Users often take random breaks within a boring webinar to unfreeze their mind with some funny cat videos. They write the time they stopped watching on a sticky note so that they can resume the videos at that time. It would be nice if MeTube just resumed the videos at whatever point they stopped, so a Product Manager slips the following user story into Jira.
As a user watching a video on MeTube, if I stop watching a video partway through and come back to it later, it should resume from the point I watched last.
The user story above has some assumptions. It assumes that the platform can track the last point a user watched in a video, and that it can do this for every user. Hence, it assumes that the platform is scalable. But such tracking can be costly, so some trade-offs may be necessary to fit the platform’s budget. One such trade-off may be to reduce the resolution of the tracking, say by persisting save points every 10 seconds rather than every 5 seconds. Your tests would then demonstrate that save points are accurate to within 10 seconds. See, the users don’t care; just make it work.
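To make the trade-off concrete, here’s a minimal sketch. The `SavePointStore` class and its method names are illustrative, not MeTube’s actual design; the point is that the test asserts behaviour at the agreed resolution, not the exact stop position.

```python
# Hypothetical sketch: a store that persists save points at a reduced
# resolution of 10 seconds, and a test of the agreed accuracy.

SAVE_INTERVAL = 10  # seconds between persisted save points


class SavePointStore:
    """Keeps the last persisted save point per (user_id, video_id) pair."""

    def __init__(self):
        self._points = {}

    def record(self, user_id, video_id, position):
        # Round down to the last 10-second boundary before persisting.
        self._points[(user_id, video_id)] = (position // SAVE_INTERVAL) * SAVE_INTERVAL

    def last_save_point(self, user_id, video_id):
        return self._points.get((user_id, video_id), 0)


def test_resume_is_accurate_to_within_the_save_interval():
    store = SavePointStore()
    store.record("alice", "boring-webinar", position=137)
    resumed = store.last_save_point("alice", "boring-webinar")
    # Stopping at 137s resumes at 130s: lost progress is under 10 seconds.
    assert 0 <= 137 - resumed < SAVE_INTERVAL
```

The test pins down what was agreed (accuracy within 10 seconds) without coupling itself to how the store decides when to persist.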
Another assumption of this user story is that the platform is privacy-preserving. In the implementation, the component that tracks save points must know about authentication in order to preserve privacy. Here’s what may happen if that’s not the case. Suppose the save point implementation is exposed over an HTTP API such that the client can fetch the last save point before playing a video. Suppose this API accepts a user ID and video ID pair, and returns the last save point if it exists for that pair. A savvy user can observe this and make requests to know what videos other users have been watching, provided they know the users’ IDs and the video IDs. That would be a terrible violation of privacy.
Of course such an implementation would be silly, but it would be easy to test: send some ID pairs, get the expected responses. But now we know that we must include authentication in the test so that we respect users’ privacy assumptions. The users didn’t tell you this, but stop a user and ask them, “is it okay if other users can see what videos you watched last and your save points just because they’re curious?”, and watch the shadow cross their face as they stare at you in fear and disgust.
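A sketch of what that authentication-aware test might look like, assuming a hypothetical `get_save_point` handler that receives the authenticated caller alongside the requested IDs:

```python
# Hypothetical sketch: the save-point lookup must know who is asking,
# not just whose data is being asked for.

class Forbidden(Exception):
    """Raised when a caller asks for another user's save point."""


def get_save_point(authenticated_user, user_id, video_id, store):
    if authenticated_user != user_id:
        raise Forbidden("you may only read your own save points")
    return store.get((user_id, video_id), 0)


def test_users_cannot_read_each_others_save_points():
    store = {("alice", "cat-video"): 42}
    try:
        # Mallory is authenticated, but asks for Alice's save point.
        get_save_point("mallory", "alice", "cat-video", store)
        assert False, "expected Forbidden"
    except Forbidden:
        pass  # privacy preserved
```

The test encodes the users’ unstated privacy assumption directly: being authenticated is not enough; you may only read your own data.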
What’s with the Vegetables?
If you’ve eaten some cucumbers already, that’s good for you. Cucumber is a tool that supports Behaviour-Driven Development (BDD). BDD is intended to improve collaboration and the development of shared understanding on a software team. If you’ve been paying attention, you may have noticed that the problem of “what to test” is a problem of inadequate understanding. It is a problem that is best solved through collaboration. Even though Cucumber is a nice tool, it’s not required for practising BDD. BDD is concerned with exploring proposed changes to a system through concrete examples, discovering and agreeing on the details, documenting what was agreed, and driving the implementation using the documented examples.
This gives us a powerful way to ask important questions that’ll make test cases obvious. Testing is not the goal; understanding is. A good way to explore a user story is to ask for examples of comparable implementations of it. Most software has few, if any, unique attributes. A good example for the MeTube user story above would be “just as YouTube does it”. That example can clarify many things easily: if there’s confusion about any aspect of the feature, we just ask YouTube.
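If the team does use Cucumber, the agreed examples get written down as Gherkin scenarios. The feature and step names below are illustrative, not a real MeTube spec, but they show how both the resume behaviour and the privacy assumption become plain-language examples the whole team can read:

```gherkin
Feature: Resume playback
  As a user watching a video on MeTube
  I want playback to resume where I stopped
  So that I don't have to write timestamps on sticky notes

  Scenario: Returning to a partly watched video
    Given I am signed in as "alice"
    And I stopped watching "Boring Webinar" at 02:17
    When I open "Boring Webinar" again
    Then playback resumes within 10 seconds of 02:17

  Scenario: Another user cannot see my save points
    Given I am signed in as "mallory"
    When I request the save point of "alice" for "Boring Webinar"
    Then the request is refused
```

Each scenario is a concrete example from the discovery conversation, and each becomes an executable test once the steps are implemented.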
By applying BDD to a user story, you can unpack it into concrete examples that can be used to formulate test cases. You’ve probably heard the testing mantra, “test behaviour, not implementation”. The examples you derive demonstrate the behaviour you should test.
You Need More than Tests
Your discovery process can reveal many requirements for the system. Some of those will be non-functional requirements (NFRs) that exist to support the use cases. NFRs are often better monitored than tested. For example, the availability of a service is better determined by monitoring the service’s uptime. Performance can be tested using benchmarks, but the performance of a long-running system can be affected by factors other than usage, so it’s important to have monitoring in place. The performance of the entire system matters more than that of a single component. Document the NFRs, then set up the appropriate systems for measuring them, whether they be tests, monitoring, or both. If you try to test NFRs that don’t lend themselves well to testing, you will end up scratching your head again; you’ll be back where you started, but this time it’ll be because you are bringing the wrong tool to the job.
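As a rough illustration of the benchmark side, here’s a sketch that times save-point lookups against a simple dict-backed store. Everything here is illustrative; a real benchmark would use a proper harness and thresholds taken from your documented NFRs, and it still wouldn’t replace monitoring the live system.

```python
# Rough benchmark sketch: mean latency of save-point lookups against a
# hypothetical dict-backed store (names are illustrative, not MeTube's).
import time


def benchmark_lookups(n=100_000):
    store = {("user%d" % i, "video"): i * 10 for i in range(1_000)}
    start = time.perf_counter()
    for i in range(n):
        store.get(("user%d" % (i % 1_000), "video"))
    elapsed = time.perf_counter() - start
    return elapsed / n  # mean seconds per lookup


mean = benchmark_lookups()
# Compare the mean against a threshold from your NFRs, e.g. in a CI gate.
```

A benchmark like this catches regressions in a single component before release; uptime and whole-system latency still need monitoring in production.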
Deciding what to test can be difficult. This is often a result of inadequate requirements. User stories are not requirements; they are meant to trigger a thoughtful exploration of the problems posed so that the actual requirements can be understood. This exploration process is a team effort that produces a shared understanding demonstrated using concrete examples. This understanding is the key to knowing what to test and what to measure in other ways such as system and application performance monitoring.