Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts.
Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts.
Am J Med Genet B Neuropsychiatr Genet. 2018 Oct;177(7):601-612. doi: 10.1002/ajmg.b.32548. Epub 2017 May 30.
The widespread adoption of electronic health records (EHRs) in healthcare systems has created a vast and continuously growing resource of clinical data and provides new opportunities for population-based research. In particular, the linking of EHRs to biospecimens and genomic data in biobanks may help address what has become a rate-limiting step for genetic research: the need for large sample sizes. The principal roadblock to capitalizing on these resources is the need to establish the validity of phenotypes extracted from the EHR. For psychiatric genetic research, this represents a particular challenge given that diagnosis is based on patient reports and clinician observations that may not be well captured in billing codes or narrative records. This review addresses the opportunities and pitfalls in EHR-based phenotyping with a focus on its application to psychiatric genetic research. A growing number of studies have demonstrated that diagnostic algorithms with high positive predictive value can be derived from EHRs, especially when structured data are supplemented by text mining approaches. Such algorithms enable semi-automated phenotyping for large-scale case-control studies. In addition, the scale and scope of EHR databases have been used successfully to identify phenotypic subgroups and derive algorithms for longitudinal risk prediction. EHR-based genomics is particularly well suited to rapid look-up replication of putative risk genes, studies of pleiotropy (phenome-wide association studies or PheWAS), investigations of genetic networks and overlap across the phenome, and pharmacogenomic research. EHR phenotyping has been relatively under-utilized in psychiatric genomic research but may become a key component of efforts to advance precision psychiatry.