The effects of electronic medical record phenotyping details on genetic association studies: HDL-C as a case study.

Authors

Dumitrescu Logan, Goodloe Robert, Bradford Yukiko, Farber-Eger Eric, Boston Jonathan, Crawford Dana C

Affiliations

Center for Human Genetics Research, Vanderbilt University, 2215 Garland Avenue, 519 Light Hall, Nashville, TN 37232 USA; Department of Molecular Physiology and Biophysics, Vanderbilt University, 2215 Garland Avenue, 519 Light Hall, Nashville, TN 37232 USA.

Center for Systems Genomics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 512 Wartik Laboratory, University Park, PA 16802 USA.

Publication information

BioData Min. 2015 May 6;8:15. doi: 10.1186/s13040-015-0048-2. eCollection 2015.

DOI:10.1186/s13040-015-0048-2

PMID:25969697

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4428098/

Abstract

BACKGROUND

Biorepositories linked to de-identified electronic medical records (EMRs) have the potential to complement traditional epidemiologic studies in genotype-phenotype studies of complex human diseases and traits. A major challenge in meeting this potential is the use of EMR-derived data to extract phenotypes and covariates for genetic association studies. Unlike traditional epidemiologic data, EMR-derived data are collected for clinical care and are therefore highly variable across patients. The variability of clinical data coupled with the challenges associated with searching unstructured clinical notes requires the development of algorithms to extract phenotypes for analysis. Given the number of possible algorithms that could be developed for any one EMR-derived phenotype, we explored here the impact algorithm decision logic has on genetic association study results for a single quantitative trait, high density lipoprotein cholesterol (HDL-C).
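To make "algorithm decision logic" concrete, the following is a minimal Python sketch of how alternative rules might collapse one patient's irregular HDL-C lab history into a single analysis value. The record layout, the three rule names (`first`, `median`, `median_pre_medication`), and the sample values are hypothetical illustrations; the abstract does not specify the five algorithms actually used.

```python
from statistics import median

# Hypothetical EMR lab records (not from the study):
# (patient_id, day_of_record, hdl_mg_dl, on_lipid_medication)
RECORDS = [
    ("p1", 10, 42.0, False),
    ("p1", 200, 48.0, False),
    ("p1", 400, 55.0, True),
    ("p2", 30, 61.0, False),
    ("p3", 5, 38.0, True),
]


def by_patient(records):
    """Group records by patient, each history ordered by day of record."""
    grouped = {}
    for pid, day, value, on_med in sorted(records, key=lambda r: (r[0], r[1])):
        grouped.setdefault(pid, []).append((day, value, on_med))
    return grouped


def first_value(history):
    """Rule 1 (assumed): earliest recorded HDL-C value."""
    return history[0][1]


def median_value(history):
    """Rule 2 (assumed): median of all recorded HDL-C values."""
    return median(value for _, value, _ in history)


def median_pre_medication(history):
    """Rule 3 (assumed): median of values recorded while not on lipid medication."""
    values = [value for _, value, on_med in history if not on_med]
    return median(values) if values else None


DEFINITIONS = {
    "first": first_value,
    "median": median_value,
    "median_pre_medication": median_pre_medication,
}


def extract(records, rule):
    """Apply one rule to every patient; drop patients the rule cannot score."""
    result = {}
    for pid, history in by_patient(records).items():
        value = rule(history)
        if value is not None:
            result[pid] = value
    return result


phenotypes = {name: extract(RECORDS, rule) for name, rule in DEFINITIONS.items()}
for name, values in phenotypes.items():
    print(name, values)
```

Even in this toy setting the rules disagree on both the value assigned to a patient and on which patients are retained, which is the kind of variation the study sets out to evaluate.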

RESULTS

We used five different algorithms to extract HDL-C from African American subjects genotyped on the Illumina Metabochip (n = 11,519) as part of Epidemiologic Architecture for Genes Linked to Environment (EAGLE). Tests of association between HDL-C and genetic risk scores for HDL-C associated variants suggest that the genetic effect size does not vary substantially across the five HDL-C definitions.
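The comparison described above can be sketched roughly as follows: compute a genetic risk score per subject, regress each HDL-C definition on that score, and compare the estimated slopes. This is only an illustration under stated assumptions. The unweighted allele-count score, the unadjusted least-squares slope, and all genotype and HDL-C values are hypothetical; the abstract does not say how the scores were weighted or which covariates the models included.

```python
def risk_score(allele_counts, weights=None):
    """Sum of HDL-C-associated allele counts (0, 1, or 2 per variant).

    Unweighted by default; this choice is an assumption for illustration.
    """
    if weights is None:
        weights = [1.0] * len(allele_counts)
    return sum(w * g for w, g in zip(weights, allele_counts))


def ols_slope(x, y):
    """Slope of a simple (unadjusted) least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical genotypes at three HDL-C-associated variants.
GENOTYPES = {
    "s1": [0, 1, 0],
    "s2": [1, 1, 0],
    "s3": [2, 1, 1],
    "s4": [2, 2, 2],
}

# Hypothetical HDL-C values (mg/dL) under two phenotype definitions.
HDL_BY_DEFINITION = {
    "first": {"s1": 40.0, "s2": 44.0, "s3": 52.0, "s4": 60.0},
    "median": {"s1": 41.0, "s2": 45.0, "s3": 51.0, "s4": 61.0},
}

subjects = sorted(GENOTYPES)
scores = [risk_score(GENOTYPES[s]) for s in subjects]

# One effect-size estimate per phenotype definition.
slopes = {
    name: ols_slope(scores, [values[s] for s in subjects])
    for name, values in HDL_BY_DEFINITION.items()
}
for name, slope in slopes.items():
    print(f"{name}: {slope:.3f} mg/dL per risk allele")
```

Similar slopes across definitions would correspond to the pattern the authors report; an actual analysis would also need covariate adjustment and standard errors before drawing that conclusion.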

CONCLUSIONS

These data collectively suggest that, at least for this quantitative trait, algorithm decision logic and phenotyping details do not appreciably impact genetic association study test statistics.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/966d/4428098/10dd31824b29/13040_2015_48_Fig1_HTML.jpg
