The idea of test coverage is a bit of a holy grail in the software testing world. When unit testing you’ll often hear about a certain percentage of the code being covered, and with higher-level testing you will often hear questions about how well we have covered a feature. As with most things, coverage is a fuzzy idea, but one of the most important lessons I’ve learned about it (from Dorothy Graham in this talk) is to ask the question ‘of what?’ Whenever we realize we are talking about coverage we should be thinking about what we are trying to cover.
It is very helpful to realize that there are many, many forms of coverage. We can never cover all the coverages, which is why complete testing of any non-trivial software is impossible, but in some cases we can theoretically cover all (or most) of one type of coverage. We could, for example, theoretically get complete line coverage of a code base.
I realized recently, though, that knowing how to get complete coverage of a particular area can be a bit of a double-edged sword. Just because we know how to cover something completely doesn’t mean we ought to. In fact, sometimes even using sampling mechanisms like combinatorial testing doesn’t make sense.
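As a small illustration of why ‘of what?’ matters, here is a minimal sketch in Python. The function and test names are made up for this example. One test is enough to execute every line of the function, yet an entire class of input is never tried.

```python
def average(values):
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def test_average_of_three_numbers():
    # This single test executes every line of average(), so a line
    # coverage tool would report the function as fully covered.
    assert average([1, 2, 3]) == 2


def empty_input_fails():
    # Coverage of input data is another matter: the empty list was
    # never tried, and it divides by zero.
    try:
        average([])
    except ZeroDivisionError:
        return True
    return False
```

Complete coverage of lines, in other words, says very little about coverage of inputs, states, or user behaviour.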
I was recently trying to test something that involved the ability to create ‘expressions’ according to certain known rules and inputs. It was seductive. I started to make a table of the different ways things could combine together to create different combinations. I quickly realized that in an n x m matrix like I was dealing with there were many millions of possible combinations and so I started putting in some sampling heuristics to try and reduce the problem space. As I kept going down this path of creating possible expressions, I eventually realized that I might not be effectively using my time. Sure I was using a script to help me power through these combinations, but there were little tweaks and changes that needed to be made and then for each combination I would have to run the software for a few seconds.
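To put rough numbers on a situation like this, here is a minimal sketch in Python. The sizes are invented for illustration (eight inputs with eight possible values each, three seconds per run) and are not the real feature's numbers. Plain random sampling stands in for whatever sampling heuristic you might actually use.

```python
import math
import random

SECONDS_PER_DAY = 24 * 60 * 60


def total_combinations(dimensions):
    """Size of the full cross product of all the input dimensions."""
    return math.prod(len(values) for values in dimensions)


def days_to_run(num_runs, seconds_per_run):
    """Wall-clock days needed to run every combination one after another."""
    return num_runs * seconds_per_run / SECONDS_PER_DAY


def random_sample(dimensions, k, seed=0):
    """Pick k combinations at random (repeats are possible)."""
    rng = random.Random(seed)  # seeded so the sample is reproducible
    return [tuple(rng.choice(values) for values in dimensions) for _ in range(k)]


# Hypothetical problem: 8 inputs, each with 8 possible values.
dimensions = [list(range(8)) for _ in range(8)]
full = total_combinations(dimensions)
sample = random_sample(dimensions, 1000)

print(f"full cross product: {full:,} runs, "
      f"about {days_to_run(full, 3):.0f} days at 3 seconds each")
print(f"random sample: {len(sample):,} runs, "
      f"about {days_to_run(len(sample), 3) * 24:.1f} hours")
```

Even with made-up numbers the shape of the problem is clear: sampling makes the work possible, but it still costs real time, and that time has to be weighed against everything else that is not being covered.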
It was going to take days to get through it all. Was it worth it? When I stopped to think about the idea of ‘coverage of what?’ I realized that perhaps I was focusing in on an area where the value of my coverage was low. There were many other aspects of coverage of this feature that I was not considering because I was so focused on this one area of coverage. The reality was that the ability to get a high level of coverage in a certain area had seduced me into spending too much time in that area. Just because I can do something and I know exactly how to do it doesn’t mean it is the most valuable thing to spend my time on. I had to leave that area with lower coverage and focus in on other areas instead, because the reality was that I was much more likely to find problems in those areas.
This is one of the challenges of having measurable coverage. Some types of coverage are much harder to measure, but that doesn’t mean they are less important. When we have a particular area we can measure, it can give us goals to work towards. This can be helpful, but if we let it drive our thinking too much we can easily end up doing low value work to meet a coverage goal in the place of some much higher value work on other aspects of coverage. I think we all tend to be biased towards what we know and understand, but don’t forget that it is often in the less explored areas that the nuggets of gold are to be found.
Testers are kind of like fortune tellers. We need to be able to predict the future, or at least it ought to seem that way.
One of the things people joke about is how mean we testers can be to the product. We find ways to break things that can be quite surprising to others on the team. I like to think of that not as being a dream wrecker, but as being a fortune teller. How did I find that issue? Well, I went into the future and thought about what kinds of things the users might do. The kinds of things that require understanding the way humans think and interact with software. Humans use software to help us accomplish things, but we don’t always do so in a linear fashion or in the ways that those of us who design think we will.
Like a good fortune teller, we testers understand how humans tick and what biases and flaws we have. We know how to size up a user and anticipate how they will react to the system we are working with. We know how to make reasonable inferences from small amounts of information. We know where users are going to stumble and where they are going to be frustrated. We know all this because we pay attention to both the human element and the technical element. Software testing sits squarely at the intersection of humans and technology and so as testers we study both. We understand the technology and we understand the humans, but most of all we understand how they interact with and influence each other.
It may seem like what we do is magic, but much like a fortune teller’s act, it comes from years of practice and study. We have experimented and honed our skills. We have made predictions and seen where they have been wrong and we have learned from that. We have been students of our craft and so it can seem like what we do is easy or magic, but the reality is, it is experience, study and practice that has brought us to this place.
In a data driven world, it may seem like we don’t need these skills anymore. Who needs to be able to predict the future when we can react to it in real time? But who is going to ask the questions that need to be asked? Who is going to figure out what data to gather? Who is going to be able to look at that data and understand the thinking of the humans behind that data? The data driven future is not a place where there is no need for these fortune telling testers. It is a place that will see their skills leveraged in ways that will allow for astounding and amazing things to happen. It is a world in which testers will be able to move from the fortune teller’s booth at the fair to the big stages of Penn and Teller. A world in which testers will have resources and data that will allow them to use their skills to bring new value and insights to projects in unanticipated ways. A world that will open up new vistas and opportunities as these skills are partnered with new technologies and insights.
It’s a world I look forward to.
We were recently discussing this article at a team meeting, and as part of that discussion we were talking about some of the inconsistencies in our product. One area where we have inconsistencies is in how different parts of the product handle the data coming from the UI. Depending on what kind of problem you are looking at, we have radically different paradigms for how we manage that data before sending it down to the low level engines. At the UI level the product looks fairly consistent, although once in a while these under-the-hood differences do show up, but in the data management layer it’s a whole different story.
There are clearly inconsistencies in our product but is it inconsistent in a way that matters? From the end user perspective it is fairly consistent, but then once you get into the data management layer there are some very big inconsistencies. Does this matter? Should we worry about making it consistent? Well, one of the things that struck me during this discussion was that we were in a group of testers who worked on different areas of the product and we would each struggle to do deep testing if we were to switch areas of focus. I think one of the main reasons it would be difficult for us to move effectively from one area of the product to another is the inconsistencies in the data management layer. So does this inconsistency matter? I would argue that yes it does. In this case it is affecting the testability of the product.
There are many ways in which this kind of inconsistency in the product hurts us. Let me just rattle off a few of them. The automated tests look very different as you move from one area of the product to another. The testers end up somewhat tied to a particular area of the product, leading to less cross pollination of ideas (although we are making deliberate moves to learn new areas). When we add new features that are used by multiple areas in the product, the testing effort for these increases greatly because we have to check how each feature works with each of the areas. It is much more difficult to test shared features like this than it would be if we had a common data management layer. The inconsistencies under the hood of our product certainly affect the testability.
There are initiatives under way to help consolidate some of the data management layer and hopefully this will help with some of the inconsistencies, but I wonder in the meantime what we as testers can do about it? I think one of the main things we can do is to learn how these various areas work and how they are inconsistent. We can then use this information in our areas of expertise to talk with developers about the kinds of things that other groups do. We can be the stitching that pulls the various areas together. Another thing we can do is ask questions. How do other groups handle this? What have other teams done to deal with this problem? By asking questions like this we can help people to think about consistency as we move forward.
Testers need to be advocates for testability in the products we test and sometimes that also means being an advocate for consistency. How do inconsistencies in your product affect the testability?
I got an interesting chat the other day from one of the developers on my team. He wanted us to attach testing results to defects so that we could prove what testing had been done as part of the defect fix. My first question was to ask what was motivating this change. In the course of the conversation it came out that some of the more senior management had gotten upset about automated regression test passing rates dipping due to changes from some defect fixes and so the developers wanted to be able to point to something that ‘proved’ what testing had been done. What they basically wanted was some way to cover their butts.
The proposal was that we attach results of test runs to defects. My ‘I hate paperwork’ flags were up all over the place and we had a conversation about how ISO 9000 compliance does not mean heavyweight, paperwork driven processes like this. We will also be continuing this conversation in a full team meeting, but at the end of the day the part about this that I really don’t like is the idea that testing is about covering your butt. I think that completely misses the entire point of what makes testing valuable. I am not in the butt covering business. I’m in the thinking tester helping to get good quality product shipped more quickly business. I’m in the helping fix the problems that are hurting our quality business. I’m in the do what it takes to get our builds running smoothly business. I’m in the help you understand the risks better business. But I most assuredly am not in the butt covering business.
We will continue the conversation and talk about how to best deal with the concerns senior management has about test pass rates. We will talk about what the real risks are to the product and how we can communicate those to senior management. We will talk about what things we can do to ensure that we are indeed doing adequate testing on defect fixes. We’ll talk about all these things, and perhaps the answer will even be that we need to do more paperwork (I’m trying to keep an open mind here), but if that is the answer, I want us to do it because it helps us do our job better as a team and not because it covers our butts.
It is ok to make mistakes sometimes. It is even ok to get in trouble once in a while for your goofs. What is not ok is doing a lot of work for something that does not add any value to the product. Don’t worry about covering your butt. Worry about being good at producing high quality software.