Tutorial D

Let’s Test an LLM

Michael Bolton

9:30–12:30 CEST, Tuesday 3rd June

“We shape our tools; thereafter, our tools shape us.”  —Marshall McLuhan

At the time this description is being written, there’s heated controversy over using AI in testing. Some people enthusiastically promote Large Language Models for developing test ideas, creating test strategy documents, or generating test data, and claim that these tools will dramatically improve testing. Others point to serious problems in the quality of the output the LLMs produce, the effort required to validate that output, and the way the tools influence the work.

Your organization may be considering applying LLMs in testing. As a responsible tester, you need to be able to tell the difference between valuable tools and vendor hype.

Let’s take a half day to learn how to examine and test the claims being made for LLMs in testing. To do that, we’ll go through the process of developing and performing tests: evaluating a product by learning about it through experiencing, exploring and experimenting.

We’ll use an LLM for a couple of testing tasks — test design and data generation — and we’ll evaluate the results. Those of us who are comfortable with coding or tools will use them to help perform the analysis. (If you’re not a coder, that’s okay.)

We’ll apply techniques (currently) being touted to improve the quality of the output. We’ll go beyond a simple evaluation or demonstration, though — because while the output is important, our primary task will be to evaluate how the LLM affects the nature of testing work itself.

This will be a challenging exercise, because we will be taking testing seriously. We will not simply provide an analysis of the LLM’s output; that’s just one part of a test. We’ll also develop the two other essential elements of a comprehensive test report: a story about how we tested, and a story about the quality of the testing work itself, including threats to the validity of the test.