What came first? The chicken or the egg? A few thoughts about software testing Metrics
There are many topics which fuel debates and emotional discussions within the software testing community, but it seems that there’s one that really makes everyone excited – Metrics!
How do you measure your testing efforts? What is the best way to evaluate effectiveness, and which elements should be quantified? How do we know if the testing we perform is of good quality? And so on. There seem to be more answers to these questions than there are people answering them.
As a software test management tool provider, we were always trying to figure out the best approach. We read the related literature, participated in all the right forums, discussed it with industry experts and gathered feedback from customers about how they see it implemented, and what they would like to measure and follow as the “best” metrics possible.
Long story short – we didn’t find a single good practice that we could comfortably say fits all of our customers, or even most of them.
We came to realize that testing metrics are subjective, and can only be decided upon on a per-company, or even a per-project basis, with all these aspects and angles being considered:
– What is the project scope?
– What is the team size? Is there more than one team for this development project?
– Is this a new project, or maybe a follow-on from a previous one?
– Are there any business measures of success of the project? How does it reflect on the testing part?
– Which methodology is used, if at all?
– Is there any previous metrics data to compare with? Do we want to compare?
– What is the available data to create metrics with?
– Do we want these metrics to define if the tested application can go out to the market? Or is there a different goal?
– Who is calling the shots in the project (QA? Development? Marketing?)?
– Are we concerned about the process or only the end-results?
The list goes on. And the answers are typically very different from one project to another.
We have no intention here to discuss the different metrics options, or to cover calculations, data gathering, and how to use them. This you probably know better than us anyway :-)
We do, however, want to raise one thing for you to think about – do you create and use the metrics to improve your software testing, and the team’s productivity and performance? Or do the metrics you chose turn into a target in and of themselves, altering the way you work and the efforts you put in?
Is it the chicken or the egg?
Are you chasing your tail?
The act of collecting specific data elements and creating measurements and criteria may in turn affect the way people behave.
When something is measured, people assume it is important, and therefore act to get better results on that specific measured element. We all want to look good, and will do a lot to get ‘our numbers’ in the right place on the charts. That’s not always a good thing. It can turn out to be counter-productive to the overall work a person should be trying to achieve.
A few might concentrate on their own targets, and ignore, even without noticing, the good of the team and the project.
There are many examples of this outside of testing. Take for example the “poverty line” as many countries define it (usually a minimum income per family). Decision makers and politicians who want to look good to their voters may forget about the right thing to do or the long-term objectives, and try to line up the numbers so that they show a decrease in the number of families under that line, only at the time of the measurement. In many cases this can mean more poor people in the long term :-(
Or take a police unit that gets rewarded for generating more revenue. Do you think they’ll chase the thieves? Forget it, and expect many traffic tickets in that area…
The same can happen to your testers. And you might not want that.
Twisting the tester’s job
Let’s take a simple testing metric that many, unfortunately, seem to adopt: measuring a tester by the number of defects found.
The main reason companies like it is that they believe the tester’s job is to find defects, so they assume that measuring it will give a good view of the quality of the job. More defects identified early would mean earlier fixes, and a better end product.
However, we believe that in many cases the opposite might happen. This metric, as an example, might cause the tester to simply report more minor defects, to split a possible defect into a few different defects, and to avoid the effort of finding defects that take a long time to investigate. They might also pay less attention to describing the defects in detail, in order to save time that is better spent identifying more defects. This might have a knock-on effect on development, or on the product release. Developers need to deal with more issues, some of which may turn out to be non-issues, or very minor.
This example illustrates how hard it can be to define a metric that doesn’t affect the tester’s job, and the way they will perform it. In most cases the effect will be negative and not positive.
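To make the ‘splitting’ effect concrete, here is a minimal Python sketch. Everything in it is invented for illustration: the `Defect` record, its `root_cause` and `severity` fields, and the sample data are assumptions, not taken from any real project or tool. It only shows how a raw count of reports can rank two testers very differently from a count of the distinct problems they actually found.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Defect:
    reporter: str
    root_cause: str  # hypothetical field: the underlying problem behind the report
    severity: str    # hypothetical scale: "minor" or "major"


# Invented sample data: tester A files one minor issue as four separate reports,
# tester B spends the same time tracking down a single major defect.
reports = [
    Defect("A", "typo-on-login-page", "minor"),
    Defect("A", "typo-on-login-page", "minor"),
    Defect("A", "typo-on-login-page", "minor"),
    Defect("A", "typo-on-login-page", "minor"),
    Defect("B", "data-loss-on-save", "major"),
]


def raw_counts(defects):
    """Number of reports filed per tester (the metric discussed above)."""
    return Counter(d.reporter for d in defects)


def distinct_problems(defects):
    """Number of distinct underlying problems found per tester."""
    unique = {(d.reporter, d.root_cause) for d in defects}
    return Counter(reporter for reporter, _ in unique)


print("reports filed:    ", dict(raw_counts(reports)))
print("distinct problems:", dict(distinct_problems(reports)))
```

By the raw count, tester A looks four times as productive as tester B, although each found exactly one problem and B’s is the more serious one. The second function is not meant as a better way to score people; it is only there to show how much the picture depends on what you choose to count.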
There are a few ways to handle this ‘human effect’ though –
– Measure a process or a product, not people.
– Do not use metrics as a way to follow up on people or worse – as a way to measure them.
– Make sure the group understands the logic behind the measured data, its goals and that they are involved in creating it from the design stage. Get their commitment.
– Don’t use one or two metrics (and not hundreds either). Try to have a manageable set of figures that give you a more complete picture of what you are trying to achieve.
– Metrics are a good decision and management aid, but don’t let them replace your own human judgment or gut-feeling. Trust your instincts and team feedback, and not just the statistics.
BTW – Following exactly these thoughts, we have decided to give our customers a way to export the raw data, so that they can figure out for themselves what they want their own metrics to look like and how to build them.
First published in ‘Software Testing Club’ in Nov. 2010