The processing of personal data is caught between technological progress and strict data protection requirements. The General Data Protection Regulation (GDPR) requires the protection of individual rights, but also allows innovative approaches through anonymization. The Stiftung Datenschutz has published a practical guide to anonymization that provides a sound basis for anonymizing personal data securely and in compliance with the law.
Definition and meaning of anonymization
Anonymization is the conversion of personal data into a form that no longer allows conclusions to be drawn about an identified or identifiable person. The decisive consequence is that, once anonymized, the data no longer falls within the scope of the GDPR.
Anonymization is often used so that data can serve research, market analysis or software testing without infringing data protection rights.
The distinction from pseudonymization is essential, as pseudonymized data is still considered personal data and falls under the GDPR.
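The difference can be illustrated with a minimal sketch. It assumes pseudonymization by keyed hashing (HMAC-SHA256), which is only one of several possible techniques; the key, field names and sample records are hypothetical. The point is that anyone holding the key can recompute a pseudonym and link records back to a person, which is why such data remains personal data.

```python
import hashlib
import hmac

# Hypothetical secret held by the controller; illustrative only.
SECRET_KEY = b"example-key-held-by-controller"


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Made-up sample records.
records = [
    {"email": "anna@example.com", "purchase": 40},
    {"email": "ben@example.com", "purchase": 25},
    {"email": "anna@example.com", "purchase": 15},
]

pseudonymized = [
    {"pseudonym": pseudonymize(r["email"]), "purchase": r["purchase"]}
    for r in records
]

# The same person always receives the same pseudonym, so rows stay linkable,
# and the key holder can recompute the pseudonym for a known e-mail address.
target = pseudonymize("anna@example.com")
linked = [r for r in pseudonymized if r["pseudonym"] == target]
```

Here `linked` contains both purchases of the same person, even though no e-mail address appears in `pseudonymized`.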
Requirements for effective anonymization
The GDPR does not require a specific method of anonymization, but defines requirements indirectly. It is essential that the data cannot be assigned to a person by the controller or third parties with reasonable effort. Factors such as costs, technological possibilities and the probability of re-identification play a central role here.
Verification obligations:
- A controller must be able to prove that the anonymization is practically irreversible.
- Indirect identifiers such as gender, date of birth or zip code must be checked carefully, because in combination they can allow conclusions to be drawn about individuals.
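One way to carry out such a check is to count how many records share each combination of indirect identifiers. This is the idea behind k-anonymity, a measure the text above does not name and which is used here only as an illustration. The column names and sample rows are assumptions.

```python
from collections import Counter

# Assumed column names for the indirect identifiers.
QUASI_IDENTIFIERS = ("gender", "birth_year", "zip_code")


def group_sizes(rows, keys=QUASI_IDENTIFIERS):
    """Count how many rows share each combination of indirect identifiers."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def smallest_group_size(rows, keys=QUASI_IDENTIFIERS):
    """Size of the smallest group; 1 means at least one row is unique."""
    return min(group_sizes(rows, keys).values(), default=0)


def unique_combinations(rows, keys=QUASI_IDENTIFIERS):
    """Combinations that occur exactly once and therefore single out a row."""
    return [combo for combo, n in group_sizes(rows, keys).items() if n == 1]


# Made-up sample rows.
rows = [
    {"gender": "f", "birth_year": 1980, "zip_code": "10115"},
    {"gender": "f", "birth_year": 1980, "zip_code": "10115"},
    {"gender": "m", "birth_year": 1975, "zip_code": "10117"},
]

k = smallest_group_size(rows)
singled_out = unique_combinations(rows)
```

A smallest group size of 1 signals that at least one person can be singled out by these attributes alone. A larger value is a useful indicator, but it is not by itself proof of anonymity.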
Anonymization methods
The guide describes several methods that can be adapted to specific requirements:
- Randomization: Values are perturbed by random changes so that they can no longer be traced back to an individual.
- Generalization: Values are transferred into larger categories, e.g. the aggregation of age data into age groups.
- Differential Privacy: Calibrated random noise is added to query results so that aggregated outputs do not reveal individual data points.
- Synthetic data: Artificially generated data that is statistically similar to real data but has no personal reference.
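The first three methods can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure from the guide: the function names, the ten-year age bands, the zip code truncation and the use of the Laplace mechanism for a counting query are all choices made for this example.

```python
import random


def generalize_age(age: int, width: int = 10) -> str:
    """Generalization: map an exact age to an age band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_zip(zip_code: str, keep: int = 2) -> str:
    """Generalization: keep only the leading digits of a zip code."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def randomize(value: float, spread: float, rng: random.Random) -> float:
    """Randomization: shift a value by a uniform random offset."""
    return value + rng.uniform(-spread, spread)


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Laplace sample, built as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Differential privacy sketch: Laplace mechanism for a counting query.

    A count changes by at most 1 when one person is added or removed,
    so the noise scale is 1 / epsilon.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(42)  # fixed seed only to make the example repeatable
age_band = generalize_age(37)        # '30-39'
zip_area = generalize_zip("10115")   # '10***'
salary = randomize(52000.0, 2000.0, rng)
released_count = noisy_count(100, epsilon=0.5, rng=rng)
```

A smaller `epsilon` adds more noise and gives stronger protection at the cost of accuracy. Choosing the value is a risk decision that the code cannot make.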
Legal challenges of anonymization
Processing operation as part of the GDPR
Anonymization itself is a processing operation on personal data and is therefore subject to the requirements of the GDPR, including the need for a legal basis. Data protection law ceases to apply to the data only once anonymization has been completed.
Re-identification risks
Particular attention must be paid to the possibility of anonymized data becoming personally identifiable again through external information. The use of an "attacker model" for risk analysis is essential here.
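A simple attacker model can be simulated as a linkage attack: the attacker holds an external data set containing names and tries to match it against the released data using shared attributes. The sketch below assumes exact matching on two hypothetical columns. Real attackers may use fuzzy matching and further sources, so this shows only a lower bound on the risk.

```python
def linkage_attack(released, external, keys):
    """Return (name, released_row) pairs where an external identity
    matches exactly one released row on the given attributes."""
    index = {}
    for row in released:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    reidentified = []
    for person in external:
        matches = index.get(tuple(person[k] for k in keys), [])
        if len(matches) == 1:  # unambiguous match: person is re-identified
            reidentified.append((person["name"], matches[0]))
    return reidentified


# Made-up released data without direct identifiers.
released = [
    {"birth_year": 1980, "zip_code": "10115", "diagnosis": "A"},
    {"birth_year": 1980, "zip_code": "10115", "diagnosis": "B"},
    {"birth_year": 1975, "zip_code": "10117", "diagnosis": "C"},
]

# Made-up external knowledge of the assumed attacker.
external = [
    {"name": "Person X", "birth_year": 1975, "zip_code": "10117"},
    {"name": "Person Y", "birth_year": 1980, "zip_code": "10115"},
]

hits = linkage_attack(released, external, keys=("birth_year", "zip_code"))
```

In this example, "Person X" is re-identified because the combination of birth year and zip code is unique in the released data, while "Person Y" remains ambiguous between two rows.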
Integration of third parties
Third parties, such as processors, that carry out anonymization must be bound by strict contractual provisions that guarantee the security of the data and the independence of the anonymization process.
Possible applications and examples
The guide describes four central application classes:
- Anonymization as deletion: Personal data is anonymized instead of being deleted, for example applicant data once a selection procedure has been completed.
- Disclosure of anonymized data: The guide cites salary benchmarks and sales data as examples of data that can be passed on with legal certainty once anonymized.
- Anonymization in the training of algorithms: Techniques such as federated learning make it possible to use data without it being centralized or depersonalized.
- Anonymization for software tests: Synthetic data ensures that tests can be carried out without access to real personal data.
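For the software testing case, a minimal sketch of a synthetic data generator follows. It assumes a hypothetical customer schema and draws every value from hard-coded placeholder lists or random ranges, so no real record is ever read. All field names and value lists are invented for illustration.

```python
import random
from datetime import date, timedelta

# Placeholder values, not taken from any real data set.
FIRST_NAMES = ["Alex", "Kim", "Sam", "Robin"]
LAST_NAMES = ["Example", "Sample", "Tester"]


def synthetic_customer(rng: random.Random, customer_id: int) -> dict:
    """Build one artificial customer record for a hypothetical schema."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    birth = date(1950, 1, 1) + timedelta(days=rng.randrange(0, 50 * 365))
    return {
        "id": customer_id,
        "name": f"{first} {last}",
        # The reserved '.test' domain avoids real mail addresses.
        "email": f"{first.lower()}.{last.lower()}.{customer_id}@example.test",
        "birth_date": birth.isoformat(),
        "zip_code": f"{rng.randrange(0, 100000):05d}",
    }


def synthetic_customers(n: int, seed: int = 0) -> list:
    """Generate n records; a fixed seed keeps test runs reproducible."""
    rng = random.Random(seed)
    return [synthetic_customer(rng, i) for i in range(1, n + 1)]


test_data = synthetic_customers(5)
```

Generators like this cover format and volume tests. If the synthetic data is instead derived from real data to preserve its statistics, the derivation itself needs a re-identification check, because rare real values can leak into the output.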
Anonymization offers companies and institutions the opportunity to use data efficiently and in a legally compliant manner. The Stiftung Datenschutz's practical guide shows that this is not only a technical challenge, but also a legal and organizational one. Controllers must not only plan and implement anonymization processes carefully, but also be able to prove that the measures meet the high requirements of the GDPR.
The clear distinction between anonymization and pseudonymization is crucial in order to minimize legal risks and at the same time exploit the full potential of the data.