Data deletion
Data deletion is one of the simplest anonymisation methods: sensitive data is removed from the dataset. It is appropriate when the sensitive data is no longer needed, does not need to be updated and will not be used in data analysis. The method is well suited to direct identifiers, whereas indirect identifiers may require other anonymisation methods.
Example
The original data contains direct identifiers (name, surname, personal identification number, telephone number and student ID number) and indirect identifiers (age, city, diagnosis). In this example, only the direct identifiers are deleted, because they will not be used in the data analysis; deleting the indirect identifiers as well would remove almost the whole dataset.
Original data
| ID | Name, surname | Age | City | Personal code | Phone | Diagnosis |
| --- | --- | --- | --- | --- | --- | --- |
| 101 | John Berzins | 35 | Sigulda | 120390-***** | 29123456 | Hypertension |
| 102 | Anna Kalnina | 28 | Ape | 040795-***** | 26789012 | Diabetes |
| 103 | Peteris Ozols | 40 | Dobele | 3150882-***** | 22334455 | Migraine |
| 104 | Laura Linden | 32 | Suntaži | 080188-***** | 26543218 | Multiple sclerosis |
The columns containing direct personal identifiers are deleted.
Anonymised data (after deletion of direct identifiers)
| ID | Age | City | Diagnosis |
| --- | --- | --- | --- |
| 101 | 35 | Sigulda | Hypertension |
| 102 | 28 | Ape | Diabetes |
| 103 | 40 | Dobele | Migraine |
| 104 | 32 | Suntaži | Multiple sclerosis |
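The column deletion shown in the two tables can be sketched in a few lines of Python. This is only an illustration, assuming the dataset is held as a list of dictionaries; the key names (`name_surname`, `personal_code`, `phone`) and the helper `delete_columns` are hypothetical and chosen here to mirror the table headers.

```python
# Hypothetical key names that mirror the column headers of the example table.
DIRECT_IDENTIFIERS = {"name_surname", "personal_code", "phone"}


def delete_columns(records, columns):
    """Return new records without the given columns; the input is not modified."""
    return [
        {key: value for key, value in record.items() if key not in columns}
        for record in records
    ]


# The original data from the example, one dictionary per row.
original = [
    {"id": 101, "name_surname": "John Berzins", "age": 35, "city": "Sigulda",
     "personal_code": "120390-*****", "phone": "29123456",
     "diagnosis": "Hypertension"},
    {"id": 102, "name_surname": "Anna Kalnina", "age": 28, "city": "Ape",
     "personal_code": "040795-*****", "phone": "26789012",
     "diagnosis": "Diabetes"},
    {"id": 103, "name_surname": "Peteris Ozols", "age": 40, "city": "Dobele",
     "personal_code": "3150882-*****", "phone": "22334455",
     "diagnosis": "Migraine"},
    {"id": 104, "name_surname": "Laura Linden", "age": 32, "city": "Suntaži",
     "personal_code": "080188-*****", "phone": "26543218",
     "diagnosis": "Multiple sclerosis"},
]

anonymised = delete_columns(original, DIRECT_IDENTIFIERS)

for row in anonymised:
    print(row)
```

Returning new records rather than editing the originals in place is a design choice of this sketch: it keeps the source data intact until the anonymised copy has been checked. In practice the original would then be stored securely or destroyed, depending on the applicable rules.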
Even after direct identifiers have been deleted, a person can sometimes still be identified, for example through other unique features or a combination of features. In such cases, consider using another anonymisation method or deleting the unique value of the variable.
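Deleting a unique value of a variable could look roughly like the sketch below. It assumes that "unique" means a value occurring exactly once in a column, which is a simplification: whether a value is actually identifying depends on the context. The helper `suppress_unique_values` and the sample rows are hypothetical and are not taken from the example tables.

```python
from collections import Counter


def suppress_unique_values(records, column):
    """Return copies of the records in which any value of `column`
    that occurs only once in the dataset is replaced by None."""
    counts = Counter(record[column] for record in records)
    return [
        {**record, column: record[column] if counts[record[column]] > 1 else None}
        for record in records
    ]


# Hypothetical rows, invented for illustration only.
records = [
    {"age": 35, "city": "Sigulda", "diagnosis": "Hypertension"},
    {"age": 28, "city": "Sigulda", "diagnosis": "Diabetes"},
    {"age": 40, "city": "Sigulda", "diagnosis": "Migraine"},
    {"age": 32, "city": "Ape", "diagnosis": "Multiple sclerosis"},
]

cleaned = suppress_unique_values(records, "city")

for row in cleaned:
    print(row)
```

Here only the city that appears once is blanked, while the rest of that record is kept. The same check could be applied to combinations of columns, since a combination can be unique even when each value on its own is not.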