The General Data Protection Regulation (GDPR) is the privacy regulation set forth by the European Union (EU) for the collection, storage, and transmission of personal data of EU citizens and residents. The law broadly expands the definition of personal data beyond protected health information (PHI), which can be used to directly identify an individual, to also include sensitive personal data, such as race, religion, political party, and sexual orientation, as well as other identifiers that in combination can be used for indirect identification by inference. Within the United States, the Health Insurance Portability and Accountability Act (HIPAA) limits its privacy protections to PHI.
Learn more about GDPR and HIPAA here
Digital Imaging and Communications in Medicine (DICOM) is the standard for storing, exchanging, and transmitting medical images (CT, MR, US, X-ray, etc.). DICOM files contain a wealth of personal information, collected and stored to minimize the risk of patient mix-ups. As important as it is to match medical data to patients, personal data must be protected, and the handling of medical data must comply with all applicable privacy laws. DICOM images used for research, with the consent of patients, must be de-identified by removing all potential personal data before use.
DICOM de-identification is performed with a Python script that removes the values of the specific DICOM tags listed in DICOM_tags.txt. This list is a plain text file that can easily be modified by adding or deleting tags. De-identified DICOM images are written to a new directory called 'dicom_deided'.
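The approach described above can be sketched roughly as follows. This is not the code of DicomDeidentifier.py. It is a minimal illustration under several assumptions: that DICOM_tags.txt holds one tag per line in a form like "(0010,0010)" or "0010,0010", that the input files end in ".dcm", and that 'dicom_deided' is created inside the input directory. The function names parse_tag_list, blank_tags, and deidentify_directory are hypothetical. Nested sequences and private tags are not handled.

```python
import re
from pathlib import Path

# ASSUMPTION: each line of the tag list looks like "(0010,0010)" or "0010,0010",
# optionally followed by free text. The real DICOM_tags.txt format may differ.
TAG_PATTERN = re.compile(r"\(?\s*([0-9A-Fa-f]{4})\s*,\s*([0-9A-Fa-f]{4})\s*\)?")


def parse_tag_list(text):
    """Return a list of (group, element) integer pairs found in the text."""
    tags = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        match = TAG_PATTERN.match(line)
        if match:
            tags.append((int(match.group(1), 16), int(match.group(2), 16)))
    return tags


def blank_tags(dataset, tags):
    """Empty the value of every listed tag present in the dataset.

    Returns the number of tags that were cleared. The element itself is kept,
    only its value is replaced with an empty string.
    """
    cleared = 0
    for tag in tags:
        if tag in dataset:
            dataset[tag].value = ""
            cleared += 1
    return cleared


def deidentify_directory(src_dir, tag_file="DICOM_tags.txt", out_name="dicom_deided"):
    """Write de-identified copies of the .dcm files in src_dir to src_dir/out_name."""
    import pydicom  # imported here so the helpers above can be used without it

    tags = parse_tag_list(Path(tag_file).read_text())
    out_dir = Path(src_dir) / out_name  # ASSUMPTION: output lives inside src_dir
    out_dir.mkdir(exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.dcm")):
        ds = pydicom.dcmread(str(path))
        blank_tags(ds, tags)
        ds.save_as(str(out_dir / path.name))
    return out_dir
```

Keeping the tag list in a separate text file means the set of removed fields can be tightened or relaxed without touching the code, which matches the intent described above.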
Requirements: Python, Pydicom
Install Pydicom with:
pip install pydicom
Execute with:
python DicomDeidentifier.py {path to DICOMs} {optional: secret key password if hashing patient ID}
Example:
python DicomDeidentifier.py "C:\Dicom"
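The optional secret key in the command above suggests that the patient ID can be replaced by a keyed hash, so that the same patient maps to the same pseudonym across files without exposing the original ID. The exact scheme used by the script is not stated here. The sketch below assumes HMAC-SHA256 from the Python standard library, and the function name pseudonymize_patient_id is hypothetical.

```python
import hashlib
import hmac


def pseudonymize_patient_id(patient_id, secret_key):
    """Return a keyed hash of the patient ID as a 64-character hex string.

    ASSUMPTION: HMAC-SHA256 is used here for illustration only; the actual
    script may hash the ID differently.
    """
    return hmac.new(
        secret_key.encode("utf-8"),
        patient_id.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


# The same ID and key always give the same pseudonym; a different key gives a different one.
same = pseudonymize_patient_id("12345", "my secret") == pseudonymize_patient_id("12345", "my secret")
different = pseudonymize_patient_id("12345", "my secret") != pseudonymize_patient_id("12345", "other secret")
print(same, different)
```

A keyed hash is preferable to a plain hash because patient IDs are often short and guessable: without the secret key, someone could hash candidate IDs and compare them against the pseudonyms. The 64-character hex digest also fits within the 64-character limit of the DICOM Patient ID field.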
Download the free DICOM de-identifier.