National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland.
Center for Precision Medicine, Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee.
JAMA Oncol. 2022 Jun 1;8(6):835-844. doi: 10.1001/jamaoncol.2022.0373.
IMPORTANCE: Knowledge about the spectrum of diseases associated with hereditary cancer syndromes may improve disease diagnosis and management for patients and help to identify high-risk individuals.
OBJECTIVE: To identify phenotypes associated with hereditary cancer genes through a phenome-wide association study.
DESIGN, SETTING, AND PARTICIPANTS: This phenome-wide association study used health data from participants in 3 cohorts. The Electronic Medical Records and Genomics Sequencing (eMERGEseq) data set recruited predominantly healthy individuals from 10 US medical centers from July 16, 2016, through February 18, 2018, with a mean (SD) follow-up through electronic health records (EHRs) of 12.7 (7.4) years. The UK Biobank (UKB) cohort recruited participants from March 15, 2006, through August 1, 2010, with a mean (SD) follow-up of 12.4 (1.0) years. The Hereditary Cancer Registry (HCR) recruited patients undergoing clinical genetic testing at Vanderbilt University Medical Center from May 1, 2012, through December 31, 2019, with a mean (SD) follow-up through EHRs of 8.8 (6.5) years.
EXPOSURES: Germline variants in 23 hereditary cancer genes. Pathogenic and likely pathogenic variants for each gene were aggregated for association analyses.
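The gene-level aggregation described above can be illustrated with a minimal sketch. It assumes a participant counts as a carrier for a gene if they have at least one pathogenic or likely pathogenic variant in it. The participant IDs, variant calls, classification labels, and the function name `aggregate_carriers` are all hypothetical and are not taken from the study. The sketch also ignores zygosity, which would matter for a gene such as MUTYH, where the reported association is biallelic.

```python
from collections import defaultdict

# Hypothetical variant calls: (participant_id, gene, classification).
# These records are illustrative only and do not come from the study.
variant_calls = [
    ("P1", "CHEK2", "pathogenic"),
    ("P1", "CHEK2", "likely_pathogenic"),
    ("P2", "ATM", "likely_pathogenic"),
    ("P3", "ATM", "uncertain_significance"),
    ("P4", "BRCA2", "benign"),
]

# Assumed labels for the classifications that qualify for aggregation.
QUALIFYING = {"pathogenic", "likely_pathogenic"}


def aggregate_carriers(calls):
    """Map each gene to the set of participants with >=1 qualifying variant."""
    carriers = defaultdict(set)
    for participant, gene, classification in calls:
        if classification in QUALIFYING:
            # A set collapses multiple variants in one gene to one carrier.
            carriers[gene].add(participant)
    return dict(carriers)


carriers_by_gene = aggregate_carriers(variant_calls)
```

Under these assumptions, each gene yields a single binary carrier indicator per participant, which could then be tested against each phenotype.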
MAIN OUTCOMES AND MEASURES: Phenotypes in the eMERGEseq and HCR cohorts were derived from the linked EHRs. Phenotypes in UKB were from multiple sources of health-related data.
RESULTS: A total of 214 020 participants were identified, including 23 544 in the eMERGEseq cohort (mean [SD] age, 47.8 [23.7] years; 12 611 [53.6%] women), 187 234 in the UKB cohort (mean [SD] age, 56.7 [8.1] years; 104 055 [55.6%] women), and 3242 in the HCR cohort (mean [SD] age, 52.5 [15.5] years; 2851 [87.9%] women). All 38 established gene-cancer associations were replicated, and 19 new associations were identified. These included the following 7 associations with neoplasms: CHEK2 with leukemia (odds ratio [OR], 3.81 [95% CI, 2.64-5.48]) and plasma cell neoplasms (OR, 3.12 [95% CI, 1.84-5.28]), ATM with gastric cancer (OR, 4.27 [95% CI, 2.35-7.44]) and pancreatic cancer (OR, 4.44 [95% CI, 2.66-7.40]), MUTYH (biallelic) with kidney cancer (OR, 32.28 [95% CI, 6.40-162.73]), MSH6 with bladder cancer (OR, 5.63 [95% CI, 2.75-11.49]), and APC with benign liver/intrahepatic bile duct tumors (OR, 52.01 [95% CI, 14.29-189.29]). The remaining 12 associations with nonneoplastic diseases included BRCA1/2 with ovarian cysts (OR, 3.15 [95% CI, 2.22-4.46] and 3.12 [95% CI, 2.36-4.12], respectively), MEN1 with acute pancreatitis (OR, 33.45 [95% CI, 9.25-121.02]), APC with gastritis and duodenitis (OR, 4.66 [95% CI, 2.61-8.33]), and PTEN with chronic gastritis (OR, 15.68 [95% CI, 6.01-40.92]).
CONCLUSIONS AND RELEVANCE: The findings of this genetic association study analyzing the EHRs of 3 large cohorts suggest that these new phenotypes associated with hereditary cancer genes may facilitate early detection and better management of cancers. This study highlights the potential benefits of using EHR data in genomic medicine.