Since stricter data protection and privacy regulations such as the GDPR came into effect in 2018, protecting personal data has become increasingly critical.
To comply, organizations apply data masking or synthetic data generation in their testing processes. Yet this raises a critical question: are these methods truly sufficient to safeguard sensitive data?
This talk will examine the hidden dangers of training AI models on sensitive data, where synthetic data generation may create a false sense of security. We will explore how attacks such as model inversion and membership inference allow adversaries to reverse-engineer models, reconstruct training records, or infer personal information, undermining the very privacy that synthetic data generation promises.
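To make the threat concrete, here is a minimal, hypothetical sketch of a membership inference attack: the attacker queries a trained model and uses its confidence on a record to guess whether that record appeared in the training set. The dataset, the deliberately overfitted model, and the 0.9 confidence threshold are all illustrative assumptions, not material from any specific system discussed in the talk.

```python
# Membership-inference sketch: an overfitted model is more confident
# on records it memorized during training than on unseen records.
# Data, model choice, and the 0.9 threshold are illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Deliberately overfit: unbounded tree depth memorizes training records.
model = RandomForestClassifier(n_estimators=50, max_depth=None, random_state=0)
model.fit(X_train, y_train)

def confidence(model, X):
    """Attacker-visible signal: the model's top predicted probability."""
    return model.predict_proba(X).max(axis=1)

# A simple threshold attack: flag a record as a training-set "member"
# whenever the model's confidence on it exceeds a cutoff.
threshold = 0.9
member_rate = (confidence(model, X_train) > threshold).mean()
nonmember_rate = (confidence(model, X_test) > threshold).mean()

print(f"flagged as members (training records): {member_rate:.2f}")
print(f"flagged as members (unseen records):   {nonmember_rate:.2f}")
# A large gap between these two rates means membership leaks.
```

The gap between those two flagged rates is precisely the signal that published membership inference attacks exploit, and it can persist in models trained on synthetic data if the synthesizer itself memorized its source records.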
We will then review real-world privacy breaches in AI and machine learning systems, highlighting how sophisticated attacks can expose sensitive information even when synthetic data is involved.
Finally, we will turn to privacy-preserving techniques, such as differential privacy and cryptographic methods, designed to mitigate these risks and strengthen the guarantees synthetic data can offer.
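As one concrete illustration of these defenses, the sketch below applies the Laplace mechanism, a classic building block of differential privacy, to a simple counting query. The epsilon values, the sensitivity-1 count, and the toy data are assumptions chosen purely for illustration.

```python
# Laplace-mechanism sketch: add noise calibrated to a query's
# sensitivity so that any single record changes the output
# distribution only slightly. Query, data, and epsilons are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def dp_count(values, predicate, epsilon):
    """Differentially private count of records matching `predicate`.

    A counting query has sensitivity 1 (adding or removing one record
    changes the count by at most 1), so Laplace noise with scale
    1/epsilon is enough to satisfy epsilon-differential privacy.
    """
    true_count = sum(predicate(v) for v in values)
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

ages = [34, 29, 41, 58, 23, 47, 36, 62]
# Smaller epsilon => more noise => stronger privacy, lower accuracy.
for epsilon in (0.1, 1.0, 10.0):
    noisy = dp_count(ages, lambda a: a >= 40, epsilon)
    print(f"epsilon={epsilon:>4}: noisy count of ages >= 40 = {noisy:.1f}")
```

The trade-off is visible in the output: smaller epsilon yields noisier answers and stronger privacy, and a synthetic data generator trained under such a budget inherits a provable bound on how much any single record can influence what it produces.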
The talk aims to offer a deeper understanding of the complexities and risks of AI-driven synthetic data generation. By shedding light on the vulnerabilities inherent in these technologies, it seeks to foster a mindful approach to responsible AI development.