What you will learn
- Understand why traditional metrics such as accuracy, precision, and recall are not enough for evaluating generative AI
- Explore new dimensions of evaluation, including reasoning, factual consistency, fairness, robustness, and alignment with business goals
- Discover actionable strategies for designing KPI frameworks that balance technical performance with business impact
Session Details
- Intermediate
- 45 mins
- Includes 15 min Q&A
- Testing the reliability, fairness, and safety of AI models
Check out this video from Anamika Mukhopadhyay and Deepshikha to hear more about their talk on Metrics That Matter for GenAI Evaluation.
We look forward to welcoming you to EuroSTAR 2026.
Session Speakers

Anamika Mukhopadhyay
Nagarro, India
Anamika, a QA consultant with 10+ years in software testing and automation, leads Nagarro’s Emerging and Special Tech Practice. Passionate about inclusivity and equality, she explores AI-driven innovations to advance testing. Her focus is on delivering sustainable, accessible, high-quality solutions that empower diverse users across the evolving technology landscape.
Co-Session Speaker

Deepshikha
Nagarro, India
Deepshikha, with 10+ years in software testing and quality engineering, heads Nagarro’s AI in Testing practice. She drives sustainable, AI-powered automation strategies that boost efficiency, optimize processes, and speed up delivery. Her expertise empowers organizations to achieve reliable, high-quality software while enhancing business value, performance, and long-term impact.