
PART 3: Anti-Regression Approaches: Impact Analysis and Regression Testing Compared and Combined by Paul Gerrard, Gerrard Consulting, UK

  • 03/06/2010
  • Posted by EuroSTAR

Please note – This is part 3 of the article – If you missed part 1 and/or part 2, please scroll down.

Part III: Regression Testing

In Part I of this article series, we looked at the nature of regression and impact analysis. In Part II we looked at regression prevention using impact analysis and regression detection using static code analysis. In this article, we’ll focus on (dynamic) Regression Testing and how we select regression tests. The next article (Part IV) looks at regression test automation.

Impact Analysis Informs Regression Testing

The impact analysis should have been performed from both a technical and a business point of view. Each viewpoint provides a different focus, and together they inform the regression testing choices to be made. The starting point of these analyses is, of course, the prospective changes. The nature of these changes (whether they are a change in technical environment, an upgrade of an infrastructure component or a software change in application code) will help us to focus attention on different aspects of the affected system.

From the technical point of view, the proposed changes are hazardous because they may cause an unwanted change in behaviour (a regression) in a single component, in multiple components that share some common factor, or in a large proportion of the larger system (or even interfacing systems). The deeper the technical analysis, the more clearly understood these potential regressions will be. The potential regressions can guide the choice of which types of test are selected, designed and executed, and when. Unfortunately, confidence in these predictions is often low, which limits the value of the technical impact analysis.

From the business point of view, the outcome of the technical impact analysis will help business experts to focus attention on specific features of the system and the business processes that depend on them. The better the technical impact analysis, coupled with a good understanding of the architecture of the system and the way the system is used, the easier it will be to describe the potential risks of regression and therefore to focus and prioritise the regression testing using a risk-based approach.

Regression Test Approach – Some Considerations

There are several considerations when deciding how to approach regression testing. The objective is to acquire a flexible set of regression tests (a regression test pack) that addresses our selected regression test objectives. We will discuss these choices separately, but there are some important connections and dependencies between them that must be taken into account when selecting your approach.

At what level(s) do we regression test?

In general, there are three levels of regression testing.

1. Component Level. The challenge is: does unchanged functionality behave identically? (At a component level it should be predictable which components will be affected by a change). Tests must demonstrate that, post-change, and at a component or at component interface level, the software behaves exactly as it did before the change. The scope and coverage of testing is driven by the technical impact analysis.
2. System (or Sub-System) Level. The challenge is: is existing integrated functionality affected adversely? (At this level, it may not be known in advance whether functionality is affected by the change). The system (or sub-system) behaviour is either unchanged, or its changed behaviour must be acceptable (whether or not the change was predicted by impact analysis or design). The scope and coverage of testing is driven by both the technical and the business impact analysis.
3. Business (or Integrated System) level. The challenge is: ‘can we still do business (effectively and efficiently)?’ In this case, the subset of functionality used by end users to conduct their business is tested. The scope and coverage of testing is driven by the need for the business to have confidence that it can still operate the system effectively regardless of the changes made. (The amount of testing at this level is determined by several factors, including the level of trust that a business puts into its software supplier!)
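The component-level question (does unchanged functionality behave identically?) can be illustrated with a minimal sketch. It records the outputs of a component before a change and compares them with the outputs afterwards. Everything named here is hypothetical: the shipping charge functions, their rules and the chosen inputs are invented for illustration, and a real regression pack would normally live in a unit test framework.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def record_baseline(component: Callable[[Any], Any],
                    inputs: Iterable[Any]) -> Dict[Any, Any]:
    """Run the pre-change component and store its output for each input."""
    return {value: component(value) for value in inputs}


def find_regressions(component: Callable[[Any], Any],
                     baseline: Dict[Any, Any]) -> List[Tuple[Any, Any, Any]]:
    """Return (input, expected, actual) for every input whose output changed."""
    differences = []
    for value, expected in baseline.items():
        actual = component(value)
        if actual != expected:
            differences.append((value, expected, actual))
    return differences


# Hypothetical component before the change.
def shipping_charge_v1(weight_kg: int) -> int:
    return 5 if weight_kg <= 10 else 8


# Hypothetical component after the change. The intended change is a new
# charge for parcels over 20 kg; the boundary at 10 kg has moved by mistake.
def shipping_charge_v2(weight_kg: int) -> int:
    if weight_kg > 20:
        return 12
    return 5 if weight_kg < 10 else 8


# Inputs are limited to the range that the change should NOT affect.
baseline = record_baseline(shipping_charge_v1, [1, 10, 11, 20])
regressions = find_regressions(shipping_charge_v2, baseline)
print(regressions)  # [(10, 5, 8)]
```

The inputs are deliberately restricted to behaviour that the impact analysis says should be unchanged, so any difference reported is a candidate regression rather than an intended change.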

Depending on context, the emphasis on each level varies.

Where a software supplier has a test-first approach and has an automated, continuous integration regime in place, level 1 is the obvious focus. If component testing is weak, manual or not repeatable, but it is practical to automate testing through the API, or to test (manually or automatically) through the user interface, then level 2 may be the main focus. In environments where development is outsourced, software supplier testing is immature or regression problems are common, level 3 testing may be the main (or only) defence against regression. Some level 3 testing, whether manual or automated, is inevitable, as it is usually the only testing that provides tangible evidence that a system can continue to support business objectives.

Note that levels 2 and 3 may be merged for a software product company because the system testers will be guided by product managers and will have to act as ‘proxy’ customers for the product.

Where do our regression tests come from?

Since regression tests may be run many times during the lifetime of a system, it is obviously preferable to construct these tests and maintain them as a valuable asset rather than create them on an ad-hoc basis for each release (and then throw them away). The most common approach is to retain some or all tests of the original system development project and maintain them as an asset as the system evolves.

However, although the ‘retained tests’ approach is common, it is possible (and perhaps likely) that the tests created for the first version of a system may not align with your regression test objectives. For example, the regression test objective might be to focus on ‘straight-through processing’ functionality. These tests may be difficult to extract directly from your system test pack if there are (so called) negative tests, input-validation tests or ‘weird-paths’ embedded in end-to-end tests. Further, as the system evolves, the focus of regression testing may change as experience of regression is gained. Patterns of regression may emerge and the emphasis of the regression test pack may change to give more coverage to regression hotspots and less coverage to more resilient functionality. It is a good idea to review the coverage of retained tests in the light of previous experience and the impact analysis for each significant release of the software.

What is our coverage model?

The coverage goals of regression tests vary with the levels identified above.

At Component level, a rule of thumb that has been used successfully on large-scale migration projects might be appropriate: a regression test pack that achieves 80% branch coverage across an entire system should provide sufficient confidence that the changed system is functionally equivalent to the original. Of course, you need coverage analysis tools to operate such a regression test regime.
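As a rough sketch of how the 80% rule of thumb might be checked, the fragment below aggregates branch counts across components and compares the result with the target. The component names and counts are invented; in practice the figures would be taken from whatever coverage analysis tool is in use. Listing the components that fall below the target individually is an addition of this sketch, not part of the rule of thumb itself.

```python
from typing import Dict, List, Tuple

# Maps component name -> (branches executed by the pack, total branches).
BranchCounts = Dict[str, Tuple[int, int]]


def system_branch_coverage(counts: BranchCounts) -> float:
    """Branch coverage across the whole system, as a fraction from 0 to 1."""
    executed = sum(done for done, _ in counts.values())
    total = sum(total for _, total in counts.values())
    if total == 0:
        raise ValueError("no branches were measured")
    return executed / total


def components_below(counts: BranchCounts, target: float) -> List[str]:
    """Names of components whose own coverage is below the target."""
    return sorted(name for name, (done, total) in counts.items()
                  if total > 0 and done / total < target)


# Hypothetical figures for a three-component system.
measured = {
    "billing": (95, 100),
    "orders": (150, 200),
    "reports": (45, 50),
}

TARGET = 0.80
coverage = system_branch_coverage(measured)      # 290 / 350
print(f"system branch coverage: {coverage:.1%}")
print("meets target:", coverage >= TARGET)
print("weakest components:", components_below(measured, TARGET))
```

In this made-up example the pack meets the system-wide target even though one component (orders, at 75%) is below it, which may be worth knowing if that component is a regression hotspot.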

At System level, the goal would be to assure that the critical features and end-to-end processes needed by the business users are covered by tests. The coverage target would typically include the key transactions (and data variations) used in the business. Although selected paths through the business process may be used to cover these transactions, the regression test pack may be a mix of component level, integration and system tests. Most sub-system tests would be automated and driven through an API using some form of test harness or framework. System-level tests may be manual or automated through the user interface.

At the Business level, tests are usually manual and focus on the critical paths through the business processes, exercising functionality as required. Often, two- and three-dimensional matrices are used to manage coverage across business process flows, business/organisation units, products, services or technical platforms, as appropriate, to ensure coverage of the variations of usage in production.
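A two-dimensional coverage matrix of this kind can be sketched in a few lines. The example below crosses business processes with business units and reports the cells that no test covers. The process names, unit names and test identifiers are all hypothetical.

```python
from itertools import product
from typing import Iterable, List, Sequence, Tuple

# Each test is tagged with the (process, unit) cell it exercises.
TaggedTest = Tuple[str, str, str]  # (test id, business process, business unit)


def uncovered_cells(processes: Sequence[str],
                    units: Sequence[str],
                    tests: Iterable[TaggedTest]) -> List[Tuple[str, str]]:
    """Return every (process, unit) combination that no test covers."""
    covered = {(process, unit) for _, process, unit in tests}
    return [cell for cell in product(processes, units) if cell not in covered]


processes = ["new order", "refund"]
units = ["retail", "wholesale"]
tests = [
    ("T1", "new order", "retail"),
    ("T2", "new order", "wholesale"),
    ("T3", "refund", "retail"),
]

gaps = uncovered_cells(processes, units, tests)
print(gaps)  # [('refund', 'wholesale')]
```

Not every cell in such a matrix is necessarily a valid or important combination, so the list of gaps is a prompt for a decision by the business rather than a list of tests that must be written. A third dimension (product or platform, say) would simply add another sequence to the `product` call and another element to each tag.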

One valuable analysis can contribute greatly to confidence here. The data in production systems can be analysed or audited to identify ranges of values used in selected fields, the proportions of different types of business or transactions performed, the most popular combinations of values of key fields, the volumes of high and low value processes and so on. These analyses can drive the selection of test data for regression testing to ensure coverage of the business activities. Of course, these analyses cannot provide information on invalid data that is never captured in a database.
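The kind of production data analysis described above might look something like the following sketch, which finds the most popular combinations of key field values and the range of values in a numeric field. The records, field names and values are invented; a real analysis would query or audit the production database rather than an in-memory list.

```python
from collections import Counter
from typing import Any, Dict, List, Sequence, Tuple

Record = Dict[str, Any]


def popular_combinations(records: Sequence[Record],
                         fields: Sequence[str],
                         top_n: int) -> List[Tuple[Tuple[Any, ...], float]]:
    """Most frequent combinations of the given fields, with their share of all records."""
    combos = Counter(tuple(record[f] for f in fields) for record in records)
    total = sum(combos.values())
    return [(combo, count / total) for combo, count in combos.most_common(top_n)]


def value_range(records: Sequence[Record], field: str) -> Tuple[Any, Any]:
    """Smallest and largest value seen in a field."""
    values = [record[field] for record in records]
    return min(values), max(values)


# Hypothetical extract of production transactions.
production = [
    {"product": "loan", "channel": "web", "amount": 1200},
    {"product": "loan", "channel": "web", "amount": 300},
    {"product": "loan", "channel": "branch", "amount": 50000},
    {"product": "card", "channel": "web", "amount": 25},
]

print(popular_combinations(production, ["product", "channel"], top_n=1))
# [(('loan', 'web'), 0.5)]
print(value_range(production, "amount"))
# (25, 50000)
```

The proportions suggest how much of the regression test data should exercise each combination, and the range gives the boundaries that valid test values should span. As noted above, an analysis like this says nothing about invalid data that production never stores.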

How will regression tests be selected and prioritised?

Very often, where existing manual tests are to be used as the basis of a regression test pack, there is limited time available either to automate these existing tests or to run them manually. The selection process happens at three levels:

Which tests should be selected/created for the regression test pack?

This is driven primarily by the need to achieve some level of coverage: at the code level for component testing, and at the interface, key transaction or business process level for system and business testing. Clearly, the most critical business processes and transactions must be covered in the overall regression test pack.

Which tests in the regression test pack should we automate?

The drivers for this choice are, as always, greater efficiency and effectiveness. Is it more efficient and effective to automate at a component, system or business level? Developers are the best equipped to create automated, high volume tests but may not have the motivation, process or incentive to do so. Users may be most assured by seeing automated tests of the user interface but automating such tests can be expensive and troublesome. Selecting and implementing the ideal automation approach is a challenging problem which will be discussed in the next article.

Which tests in the regression test pack should be executed in this release?

The time available for regression testing is usually limited. Automated test packs can usually be run quickly and in a predictable timescale, so they may always be run as a complete pack. Manual regression testing, however, is labour intensive and time consuming, so the ability to prioritise is essential. The output of the impact analysis may highlight potential regression patterns, so tests of specific components, interfaces, calculations or transactions can be included in the regression test schedule. From the business point of view, the most critical processes, transactions, interfaces, reports and reconciliations will drive the users’ choice of what is in the schedule. It is common for the users to have a ‘minimum test pack’ that must always be run. Needless to say, the regression tests at all levels should be structured so that meaningful selections of tests can be made efficiently. Of course, the prioritisation must take account of the cost of execution, so it is very helpful to know from past experience how long the candidate tests will take to execute.
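One simple way to combine a minimum test pack, business priority and execution cost is sketched below. It is only an illustration under stated assumptions: the test names, durations and priorities are invented, and the greedy rule (take the mandatory tests first, then add the remaining tests in priority order while they still fit the time available) is one of many possible selection policies.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class RegressionTest:
    name: str
    minutes: int             # expected execution time, from past runs
    priority: int            # 1 = most critical to the business
    mandatory: bool = False  # True if part of the 'minimum test pack'


def build_schedule(tests: Sequence[RegressionTest],
                   budget_minutes: int) -> List[RegressionTest]:
    """Select the tests to run in this release within the time available."""
    schedule = [t for t in tests if t.mandatory]
    used = sum(t.minutes for t in schedule)
    if used > budget_minutes:
        # Assumption of this sketch: the minimum pack is never trimmed silently.
        raise ValueError("the minimum test pack does not fit the time available")

    optional = sorted((t for t in tests if not t.mandatory),
                      key=lambda t: (t.priority, t.minutes))
    for test in optional:
        if used + test.minutes <= budget_minutes:
            schedule.append(test)
            used += test.minutes
    return schedule


# Hypothetical manual regression tests.
pack = [
    RegressionTest("login", 30, priority=1, mandatory=True),
    RegressionTest("month-end reconciliation", 120, priority=1),
    RegressionTest("invoice run", 60, priority=2),
    RegressionTest("address change", 45, priority=3),
]

selected = build_schedule(pack, budget_minutes=200)
print([t.name for t in selected])
# ['login', 'month-end reconciliation', 'address change']
```

With 200 minutes available, the invoice run is left out because it no longer fits once the higher-priority reconciliation test has been scheduled, while the shorter address change test still does. Priorities could equally be raised for tests that the impact analysis flags as covering likely regression patterns.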

In the next article in this series, we will look more closely at the automation choices available and some techniques for making automated regression testing effective, efficient, more manageable and informative to stakeholders and management.

To be continued…
