Derive Good Test Data from Production Data Without Breaking Privacy Laws

W4     Start Time : 10:15     End Time : 11:00

Talk Abstract:

Good test data is the very foundation of good testing. But good test data is hard to get. If you create it manually, or build a script or program to generate it, the test data will probably reflect your understanding and expectations of production data rather than its actual properties. For that reason, it is unfortunately not uncommon to use production data, or data trivially derived from production data, for testing.
Using production data for testing has problems of its own. GDPR (the new EU privacy law) applies to such data. It obviously applies when production data is used directly. But, to the surprise of many, GDPR also applies in almost all situations where test data is based on scrambled or anonymized production data.

Overall content of the talk:
– The importance of good, representative and “fresh” test data and the importance of fast, cheap and low-friction access to the test data.
– Metrics for test data (how to measure test data quality).
– The compliance and security challenges (GDPR, Segregation of Duties, data loss prevention, corporate policies, etc.).
– A helicopter view of the most relevant articles of GDPR.
– A helicopter view of the techniques that can be used to protect data, such as anonymization, pseudonymization, synthetic data, tokenization, and format-preserving encryption.
– Strategies for generating test data while respecting privacy and security.
– How to ensure GDPR compliance.
– What to do and where to start.

Martin will also address some of the most prominent and serious misconceptions, such as the belief that data can easily be anonymized (and thus taken out of GDPR scope) and that a hash function can ensure privacy.
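To make the hash-function misconception concrete, here is a minimal sketch and not material from the talk itself. It assumes a hypothetical test data set in which birth dates were "scrambled" with an unsalted SHA-256 hash. The function names and the sample date are illustrative assumptions. Because there are only a few tens of thousands of plausible birth dates, every candidate can be hashed and compared, which recovers the original value.

```python
import hashlib
from datetime import date, timedelta


def sha256_hex(value: str) -> str:
    """Return the unsalted SHA-256 hex digest of a string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def recover_birth_date(hashed: str, first_year: int = 1900, last_year: int = 2025):
    """Try every ISO-formatted date in the range and return the one whose
    hash matches, or None if no candidate matches."""
    day = date(first_year, 1, 1)
    end = date(last_year, 12, 31)
    while day <= end:
        candidate = day.isoformat()
        if sha256_hex(candidate) == hashed:
            return candidate
        day += timedelta(days=1)
    return None


# Hypothetical "scrambled" value as it might appear in a test data set.
scrambled = sha256_hex("1984-07-23")

# Exhaustive search over roughly 46,000 candidate dates finds the original.
print(recover_birth_date(scrambled))
```

The same reasoning applies to any field with a small or structured value space, such as phone numbers or national identification numbers. A secret key or salt raises the cost of this search, but the output can then still be linked back to a person by whoever holds the key.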

Without good test data, your test is not representative of the real-life production situation.

Key Takeaways:

  1. Inspiration on how to improve test data quality.
  2. Your existing solutions for generating test data from production data are most likely not GDPR compliant.
  3. What it takes to generate high-quality test data from production data in a compliant way.

  • Speaker

  • Martin Boesgaard - CEO and Founder, PII Guard, Denmark

    Martin has developed and implemented several enterprise software solutions over the years. On top of that, he is an expert in cryptography, information security and privacy, having worked in the area for almost 20 years. He is the author of 10+ scientific papers and 10+ patent applications.