No real data, no real risk? Circumventing legal privacy concerns in Artificial Intelligence/Machine Learning models with synthetic data

31 October 2023

Since the introduction of the GDPR in 2016 at the latest, the focus on (personal) data protection has increased rapidly, even if many questions of practical implementation remain unclear. In 2023, with AI and Big Data Analytics on everyone’s lips, numerous other European legal acts are also on the way (e.g. the Data Governance Act and the Data Act). These aim to extend protection on the one hand and to make collected data more easily available on the other.

In our research project OPTIMA[1] at the Department of Innovation and Digitalisation in Law, we are addressing the legal concerns in the development of a clinical decision-support tool, in which patient data can be entered and AI/ML is used to suggest the most suitable treatment for cancer patients.

OPTIMA’s mission is to design, develop and deliver the first secure, sustainable, interoperable and GDPR-compliant European real-world oncology data and evidence generation platform, based on the needs of clinicians and patients, in order to support them in their shared decision-making process. The aim is to use large amounts of medical patient data that are already available under the legal framework, so as to obtain the largest possible number of different data sets in which correlations can be investigated.

The use of medical data brings privacy concerns with it. The current legal framework, first and foremost the GDPR, applies to personal data, which in any case includes patients’ medical data from electronic health records, medical images, clinical notes and the like. Processing such data is therefore only possible under restrictions. Article 9 GDPR starts with a prohibition on processing these special categories of data, including genetic data and data concerning health, and then carves out specific instances in which such processing is permissible, for example when the data subject has given explicit consent, which could serve as the legal basis for OPTIMA.

Hence, an emerging method is to use synthetic data in order to minimize the concerns and risks attached to collected data while still being able to guarantee the best possible therapy for the individual. Synthetic data is artificially created by AI algorithms based on real data, rather than being sourced from actual human interactions or “real-world” events. When generated effectively, synthetic data closely mimics the statistical patterns and characteristics of the original dataset. This approach can help alleviate privacy concerns and facilitate more open sharing and use of data. It has garnered significant interest as a means of implementing differential privacy measures for healthcare data and is often lauded as a potential remedy for various privacy and data protection issues. In principle, it has the potential to make data sharing more transparent, to make models more resilient, and to minimize the risks to the individuals whose data is used. Furthermore, synthetic data offers the opportunity to introduce diversity into pre-existing datasets.
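To make the idea of “mimicking statistical patterns” more concrete, the following is a minimal sketch and not the method used in OPTIMA. It assumes a toy table with two numeric columns (the values in `real_rows` are invented for illustration), fits a simple two-dimensional Gaussian to it, and then draws new records from the fitted distribution. The names `fit_params` and `sample_synthetic` are hypothetical. The sketch illustrates the principle only: it provides no privacy guarantee by itself and is not an implementation of differential privacy.

```python
import math
import random
import statistics

# Invented (age, biomarker) pairs standing in for real records; not OPTIMA data.
real_rows = [(54, 2.1), (61, 2.9), (47, 1.8), (70, 3.4), (66, 3.0), (58, 2.4)]


def fit_params(rows):
    """Estimate means, standard deviations and the correlation of two columns."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    std_x, std_y = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (len(rows) - 1)
    return mean_x, mean_y, std_x, std_y, cov / (std_x * std_y)


def sample_synthetic(params, n, seed=0):
    """Draw n artificial records from the fitted two-dimensional Gaussian."""
    mean_x, mean_y, std_x, std_y, rho = params
    rng = random.Random(seed)
    # Clamp at zero to guard against rounding when the correlation is near 1.
    residual = math.sqrt(max(0.0, 1.0 - rho * rho))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mean_x + std_x * z1
        # Mixing z1 into y reproduces the correlation observed in the real data.
        y = mean_y + std_y * (rho * z1 + residual * z2)
        rows.append((x, y))
    return rows


params = fit_params(real_rows)
synthetic = sample_synthetic(params, n=5000)
```

None of the records in `synthetic` is copied from `real_rows`, yet refitting the model on `synthetic` yields means and a correlation close to those of the original table. That is also why the legal question discussed below arises: the synthetic records are derived from, and still carry information about, the real ones.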

However, the difference between newly generated synthetic data and pseudonymized data is unclear. Although synthetic data offers significant promise, the distinction between real and synthetic data still lacks the clarity required to effectively mitigate privacy concerns. In addition, while synthetic data may avoid the processing of personal data, it may not be exempt from the other European legal acts. The next few years of the project will therefore reveal how the legal requirements at the European level develop and how we, as the Department of Innovation and Digitalisation in Law, can contribute to researching and framing the legal development in which innovative technologies find their place.

[1] OPTIMA is funded through the IMI2 Joint Undertaking and is listed under grant agreement No. 101034347. IMI2 receives support from the European Union’s Horizon 2020 research and innovation programme and the European Federation of Pharmaceutical Industries and Associations (EFPIA). IMI supports collaborative research projects and builds networks of industrial and academic experts in order to boost pharmaceutical innovation in Europe.

Author: Katja Hartl, University of Vienna, Department of Innovation and Digitalisation in Law
