Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37232, USA.
Am J Hum Genet. 2011 Oct 7;89(4):529-42. doi: 10.1016/j.ajhg.2011.09.008.
We repurposed existing genotypes in DNA biobanks across the Electronic Medical Records and Genomics network to perform a genome-wide association study for primary hypothyroidism, the most common thyroid disease. Electronic selection algorithms incorporating billing codes, laboratory values, text queries, and medication records identified 1,317 cases and 5,053 controls of European ancestry within five electronic medical record (EMR) systems; the algorithms' positive predictive values were 92.4% and 98.5% for cases and controls, respectively. Four single-nucleotide polymorphisms (SNPs) in linkage disequilibrium at 9q22 near FOXE1 were associated with hypothyroidism at genome-wide significance, the strongest being rs7850258 (odds ratio [OR] = 0.74, p = 3.96 × 10⁻⁹). This association was replicated in a set of 263 cases and 1,616 controls (OR = 0.60, p = 5.7 × 10⁻⁶). A phenome-wide association study (PheWAS) performed on this locus with 13,617 individuals and more than 200,000 patient-years of billing data identified associations with additional phenotypes: thyroiditis (OR = 0.58, p = 1.4 × 10⁻⁵), nodular (OR = 0.76, p = 3.1 × 10⁻⁵) and multinodular (OR = 0.69, p = 3.9 × 10⁻⁵) goiters, and thyrotoxicosis (OR = 0.76, p = 1.5 × 10⁻³), but not Graves disease (OR = 1.03, p = 0.82). Thyroid cancer, previously associated with this locus, was not significantly associated in the PheWAS (OR = 1.29, p = 0.09). The strongest association in the PheWAS was hypothyroidism (OR = 0.76, p = 2.7 × 10⁻¹³), with an odds ratio nearly identical to that of the curated case-control population in the primary analysis, providing further validation of the PheWAS method. Our findings indicate that EMR-linked genomic data could allow discovery of genes associated with many diseases without additional genotyping cost.
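For readers less familiar with the two summary statistics quoted above, the following is a minimal sketch of how a positive predictive value and an unadjusted odds ratio with a Wald confidence interval can be computed from raw counts. It is not the authors' analysis pipeline: the abstract does not say how the reported ORs were estimated or adjusted, the function names are hypothetical, and every count in the example is made up for illustration rather than taken from the study.

```python
import math

def positive_predictive_value(true_pos, false_pos):
    """Fraction of algorithm-selected records confirmed on manual review."""
    return true_pos / (true_pos + false_pos)

def odds_ratio(a, b, c, d):
    """Unadjusted odds ratio from a 2x2 table.

    a: cases carrying the allele      b: cases not carrying it
    c: controls carrying the allele   d: controls not carrying it
    """
    return (a * d) / (b * c)

def wald_ci(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval computed on the log-odds scale."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts, NOT from the study.
a, b, c, d = 300, 700, 400, 600
estimate = odds_ratio(a, b, c, d)
lower, upper = wald_ci(a, b, c, d)
ppv = positive_predictive_value(true_pos=92, false_pos=8)
print(f"OR = {estimate:.3f}, 95% CI = ({lower:.3f}, {upper:.3f}), PPV = {ppv:.2f}")
```

An odds ratio below 1, as reported for rs7850258, means the odds of the phenotype are lower among carriers of the tested allele than among non-carriers.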