Track Talk, W15

Using AI to Create UAT Tests, a Case Study from BBC Radio

Bill Watson

14:45 - 15:30 CEST, Wednesday 17th June

AI seems to be everywhere, the technological zeitgeist of the moment, but just how effective is it at generating test cases? I wanted to see if we could gather some actual data on this, but running different tests against the same “fresh” software was problematic.

However, this changed with a project to upgrade the pan-BBC radio playout system, used in multiple locations across the UK. The upgrade means some 350 new test cases need to be written and then run at different radio stations with different hardware configurations.

This allows us to set up a study with three different sets of tests and run them concurrently: two sets of AI-generated tests and one control set created using our current methodology.

As we will be running these tests multiple times in different locations, we can compare different KPIs to see what AI can and cannot add to our test case creation process. The KPIs being measured include time to create test cases, time to run tests, and defects found (and not found) by each technique.

At the end of the project, we will have metrics on what has worked best in terms of input effort against goals achieved, as well as having fully explored how best to use AI for manual test creation.

This talk will cover the methodology we are using, including how we decided which metrics to focus on, the results achieved, and what we establish as best practice for creating UAT tests with (or without?) AI assistance.

This is an ongoing study, so at submission time we cannot predict what the results will be. Whatever they are, the journey to getting them is the key: laying out an approach to evaluating how AI can work in testing.