Testing for Cognitive Bias in AI: Why Machine Learning Applications Are Like People

We would like to think that AI-based machine learning systems always produce the right answer within their problem domain. In reality, their performance is a direct result of the data used to train them, and the answers they give in production are only as good as that training data.

But data collected by human means, such as surveys, observations, or estimates, can have built-in human biases, such as confirmation bias or representativeness bias. Even seemingly objective measurements can measure the wrong things, or can miss essential information about the problem domain.

The effects of biased data can be even more insidious. AI systems often function as black boxes, meaning that even the technologists who build them may not know how an AI came to its conclusion. This can make it particularly hard to identify any inequality, bias, or discrimination feeding into a particular decision.

This presentation explains how AI systems can suffer from the same biases as human experts, and how that can lead to biased results. It examines how testers, data scientists, and other stakeholders can develop test cases to recognize bias, both in the data and in the resulting system, and how to address that bias.
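As a concrete illustration of what such a test case might look like, the sketch below checks a model's predictions against the "four-fifths rule" often used as a rough screen for disparate impact: the selection rate for any group should be at least 80% of the rate for the most-favored group. The function names, data, and group labels are hypothetical, and this is only one of many possible bias checks, not the presenters' specific method.

```python
# A minimal sketch of a bias test case (names and data are hypothetical).
# It assumes a model's predictions are available alongside a protected
# attribute, and applies the "four-fifths rule" as a simple screen.

def selection_rates(records):
    """Return {group: fraction of positive predictions} per group."""
    totals, positives = {}, {}
    for group, predicted_positive in records:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + int(predicted_positive)
    return {g: positives[g] / totals[g] for g in totals}

def passes_four_fifths(records, threshold=0.8):
    """True if every group's rate is >= threshold * the best group's rate."""
    rates = selection_rates(records)
    best = max(rates.values())
    return all(rate >= threshold * best for rate in rates.values())

# Hypothetical predictions: (group label, model predicted "select"?)
predictions = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

# Group A's rate is 0.75, group B's is 0.25 -- below 0.8 * 0.75, so
# this check fails and flags the model for further investigation.
print(passes_four_fifths(predictions))
```

A test like this can run in a CI pipeline against a held-out evaluation set, so a regression in fairness is caught the same way a regression in accuracy would be.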

  • Speaker

  • Peter Varhol - Director of Practice Strategy, Kanda Software, United States

    Peter Varhol is a well-known writer and speaker on software and technology topics, having authored dozens of articles and spoken at a number of industry conferences and webcasts. He has advanced degrees in computer science, applied mathematics, and psychology, and is Director of Practice Strategy at Kanda Software. His past roles include technology journalist, software product manager, software developer, and university professor.

  • Co-Speaker

Gerie Owen - QualiTest Group, United States

Gerie Owen is Vice President, Knowledge and Innovation-US at QualiTest Group, Inc. She is a Certified Scrum Master and a frequent conference presenter and author on technology and testing topics.