Department of Medical Sciences, Graduate School of The Catholic University of Korea, Seoul, Korea.
Department of Medical Informatics, College of Medicine, The Catholic University of Korea, Seoul, Korea.
Cancer Res Treat. 2024 Oct;56(4):1027-1039. doi: 10.4143/crt.2024.146. Epub 2024 Jun 7.
In 2024, medical researchers in the Republic of Korea were invited to amend the health and medical data utilization guidelines (Government Publications Registration Number: 11-1352000-0052828-14). This study aimed to show the overall impact of the guideline revision, with a focus on clinical genomic data.
This study amended the pseudonymization of genomic data defined in the previous version through a joint study led by the Ministry of Health and Welfare, the Korea Health Information Service, and the Korea Genome Organization. To revise the previous version, we held three conferences with four main medical research institutes and seven academic societies. We also conducted two surveys of genome specialists in academia, industry, and institutes.
We found that pseudonymization had rarely been applied to genomic data in practice and that the terminology in the previous version of the guidelines was ambiguous. Most experts (more than 90%) agreed that the 'reserved' condition should be eliminated so that genomic data can be made available after pseudonymization. In this study, the scope of genomic data was defined as clinical next-generation sequencing data, including FASTQ, BAM/SAM, VCF, and medical records. Pseudonymization targets genomic sequences and metadata that embed specific elements, such as germline mutations, short tandem repeats, single-nucleotide polymorphisms, and identifiable data (for example, ID or environmental values). Expression data generated from multi-omics can be used without pseudonymization.
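The scope rule described above can be read as a simple lookup: some element types are pseudonymization targets, while expression data is exempt. The following is a minimal sketch of that reading, not the guideline's own procedure. The names `Element`, `PSEUDONYMIZATION_TARGETS`, and `requires_pseudonymization` are hypothetical and introduced only for illustration.

```python
from enum import Enum


class Element(Enum):
    """Element types named in the abstract (labels are illustrative)."""
    GERMLINE_MUTATION = "germline mutation"
    SHORT_TANDEM_REPEAT = "short tandem repeat"
    SNP = "single-nucleotide polymorphism"
    IDENTIFIABLE_DATA = "identifiable data (e.g. ID or environmental values)"
    EXPRESSION_DATA = "multi-omics expression data"


# Assumption: the targets listed in the abstract are treated as a closed set.
# The actual guideline may define further categories or conditions.
PSEUDONYMIZATION_TARGETS = frozenset({
    Element.GERMLINE_MUTATION,
    Element.SHORT_TANDEM_REPEAT,
    Element.SNP,
    Element.IDENTIFIABLE_DATA,
})


def requires_pseudonymization(element: Element) -> bool:
    """Return True if the element is listed as a pseudonymization target."""
    return element in PSEUDONYMIZATION_TARGETS


if __name__ == "__main__":
    for element in Element:
        status = "pseudonymize" if requires_pseudonymization(element) else "usable as-is"
        print(f"{element.value}: {status}")
```

A closed enumeration keeps the sketch explicit about what the abstract actually lists; any element type not named there would need a decision from the guideline itself rather than from this lookup.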
This amendment will not only enhance the safe use of healthcare data but also promote advancements in disease prevention, diagnosis, and treatment.