Past Presentations

Introducing Operational Intelligence into Testing

Have you ever seen well-tested software cause many or severe incidents straight after go-live? I certainly have: it happened when I joined an operations team and became responsible for cleanup and aftercare following releases. The engineers on our team would often ask themselves: how could the testers have missed this? Initially I defended my fellow testers with the usual arguments, until I realised they had a point. Many issues could easily have been found, and therefore should have been found, had the testers used the same analysis and monitoring techniques that operations employs.

Now that we have service-oriented architectures and complex chains of applications, predicting the full data flow of a business process beforehand is a near-impossible task. The standard approach of comparing expected and actual results no longer suffices. Operations teams don’t predict; they detect issues without first describing an expected result. They dig their way through log files and other ‘machine information’ to find issues and root causes. The latest development in operations, the use of big data tools like Splunk for operational intelligence, makes this process of gathering, correlating and analyzing machine information much more efficient, easier and more powerful.
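To make the idea of detecting issues without a predefined expected result concrete, here is a minimal sketch in plain Python. It is not Splunk and not the tooling shown in the presentation; the log format, the `txn=` correlation field, the service names and the function name are all assumptions made up for illustration. It groups log lines from several services by transaction id and flags every transaction that contains a warning or an error.

```python
import re
from collections import defaultdict

# Hypothetical log format (an assumption, not from the presentation):
#   "<timestamp> <service> <LEVEL> txn=<id> <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+) (?P<service>\S+) (?P<level>[A-Z]+) txn=(?P<txn>\S+) (?P<msg>.*)$"
)


def find_suspect_transactions(lines):
    """Correlate log lines by transaction id and return the transactions
    that contain at least one WARN or ERROR entry."""
    by_txn = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # lines that do not fit the assumed format are skipped
            by_txn[match["txn"]].append(match.groupdict())
    return {
        txn: events
        for txn, events in by_txn.items()
        if any(event["level"] in ("WARN", "ERROR") for event in events)
    }


# Made-up sample data covering two services and two transactions.
SAMPLE_LOG = [
    "2014-01-01T10:00:00 orders INFO txn=A1 order received",
    "2014-01-01T10:00:01 billing ERROR txn=A1 timeout calling payment gateway",
    "2014-01-01T10:00:02 orders INFO txn=B2 order received",
    "2014-01-01T10:00:03 billing INFO txn=B2 invoice created",
]

if __name__ == "__main__":
    for txn, events in find_suspect_transactions(SAMPLE_LOG).items():
        print(f"Transaction {txn} needs a closer look:")
        for event in events:
            print(f"  {event['ts']} {event['service']} {event['level']} {event['msg']}")
```

Note that no expected result is stated anywhere in the sketch: the transaction is flagged purely from what the applications logged, and the correlated events show which service in the chain reported the problem.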

Wait, isn’t it our job as testers to find issues and have them solved before they materialize in production? This is why we are now bringing the tools, the techniques and even the people of operations into test. Our initial results are very promising. Root cause analysis in test is more effective, and underlying issues are detected quickly, which makes the test process itself faster. We expect the quality of the delivery to benefit equally.

In this presentation, Albert explains this approach, demonstrates a tool to show how it really works and, most of all, discusses our experiences.
