Automated Testing of Large Language Models: A Live Demo

What you will learn

A structured approach for evaluating LLM-based applications
An understanding of retrieval-augmented generation (RAG) and the LLM-as-judge technique
An understanding of how a human-judge baseline can be used to pick a suitable LLM-judge model
An appreciation of the challenges involved in using LLM judges

Session Details

Introductory
45 Minutes
Includes 15 minutes of Q&A
Testing the reliability, fairness, and safety of AI models

Session Speakers

Anupam Krishnamurthy

Head of AI Testing – TestSolutions, Germany

Anupam is Head of AI Testing at TestSolutions. In the past, he has served in several roles spanning space science, software development, strategy consulting and process automation. As a manager turned software engineer, he continually chips away at the boundaries that separate the two disciplines. Anupam is currently uncovering testing principles that are applicable to AI-augmented software. The non-deterministic nature of AI software, its stochastic behaviour and the need to evaluate subjective outputs present new challenges to the field of software testing. When he is not doing stuff like this, you’ll find him engrossed in a game of online chess, running long distances with a history podcast, or curled up in a corner with a book.
