Electronic medical records are being hailed as a tool for aggregating patient data and advancing research, but questions remain about how medical and genetic information that is shared and compiled on this scale can be kept de-identified to protect patients’ privacy and security.

Researchers at Vanderbilt University have developed an algorithm that anonymizes electronic medical record information for genome-wide association studies (GWAS), according to a paper that recently appeared in the Proceedings of the National Academy of Sciences.

Their approach is called Utility Guided Anonymization of Clinical Profiles (UGACLIP), and it involves generalizing some of the diagnostic information from electronic medical records to make it more abstract and anonymous. The method was validated using data from nearly 3,000 patient electronic medical records from the Vanderbilt University Medical Center.
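
To make the idea of generalizing diagnostic information concrete, here is a minimal Python sketch. It is not the UGACLIP algorithm itself. It assumes the diagnoses are stored as ICD-9-style code strings such as "250.01", and it uses a simple, hypothetical rule (keep only the part of the code before the decimal point) to show how more abstract codes make different patients' records look alike. The function names and sample patients are invented for illustration.

```python
def generalize_code(code: str) -> str:
    """Hypothetical rule: keep only the category part of an ICD-9-style code."""
    return code.split(".", 1)[0]


def generalize_profile(codes):
    """Generalize every code in one patient's set of diagnosis codes."""
    return frozenset(generalize_code(c) for c in codes)


# Made-up records: two patients with different specific codes.
profiles = {
    "patient_a": {"250.01", "401.9"},
    "patient_b": {"250.02", "401.1"},
}

generalized = {pid: generalize_profile(codes) for pid, codes in profiles.items()}

# After generalization both patients share the same, more abstract profile,
# so neither can be singled out by diagnosis codes alone.
print(generalized["patient_a"] == generalized["patient_b"])
```

The trade-off is visible even in this toy case: the more detail is removed, the harder the records are to tell apart, but the less specific clinical information remains for researchers.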

The challenge in this project was to protect patient privacy while preserving the associations between genomic sequences and the specific sets of clinical features studied in GWAS. The threat to patient privacy in genetic research can be serious: the Vanderbilt research team found that, among 3,000 patients selected from over a million individuals, patients could be identified from their combination of diagnostic codes nearly 97 percent of the time. The Vanderbilt study, funded by the National Human Genome Research Institute and the National Library of Medicine, describes a method that automatically extracts the clinical features that could potentially be linked back to a patient and alters them so that they can no longer be linked to patients’ genomic sequences.
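
The re-identification risk described here comes down to how many patients have a combination of diagnostic codes that nobody else shares. The following Python sketch shows one simple way such a figure could be computed. It is an illustration under stated assumptions, not the measurement procedure used in the Vanderbilt study: the function name `unique_fraction`, the dictionary layout, and the sample records are all hypothetical.

```python
from collections import Counter


def unique_fraction(profiles):
    """Return the share of patients whose exact set of codes is held by no one else."""
    if not profiles:
        return 0.0
    # Treat each patient's codes as an unordered set so ordering does not matter.
    keys = [frozenset(codes) for codes in profiles.values()]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(keys)


# Made-up records: p1 and p2 share a profile, p3 and p4 are each unique.
example_profiles = {
    "p1": {"250.01", "401.9"},
    "p2": {"250.01", "401.9"},
    "p3": {"493.90"},
    "p4": {"250.01"},
}

print(unique_fraction(example_profiles))
```

A value close to 1.0 on real data would mean that almost every patient could be singled out by diagnosis codes alone, which is the situation an anonymization method aims to change.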

The team tested the UGACLIP algorithm on two real patient data sets from Vanderbilt University Medical Center’s electronic medical records system and found that it decreased the risk of linking an individual to their GWAS data while maintaining much of the clinical and diagnostic information needed to permit data sharing and follow-up studies. Building on this success, Vanderbilt researchers are continuing to improve the algorithm and plan to develop software so that other investigators can safely and securely extract electronic medical record data in future GWAS.

It is reassuring to see the research community focusing on the safety and security of electronic medical record data. Hopefully this exploration into maintaining the security of electronic medical record data will extend beyond GWAS into other fields of medical research.

To learn more, see the original article in GenomeWeb Daily News.