Large-Scale Discovery of Disease-Disease and Disease-Gene Associations.

Author information

Center for Data Analytics and Biomedical Informatics, Temple University, Philadelphia, PA 19122, USA.

Department of Biology, Temple University, Philadelphia, PA 19122, USA.

Publication information

Sci Rep. 2016 Aug 31;6:32404. doi: 10.1038/srep32404.

DOI: 10.1038/srep32404

PMID: 27578529

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5006166/

Abstract

Data-driven phenotype analyses on Electronic Health Record (EHR) data have recently benefited many areas of clinical practice, uncovering new links in the medical sciences that can potentially affect the well-being of millions of patients. In this paper, EHR data are used to discover novel relationships between diseases by studying their comorbidities (co-occurrences in patients). A novel embedding model is designed to extract knowledge from disease comorbidities by learning from a large-scale EHR database comprising more than 35 million inpatient cases spanning nearly a decade, revealing significant improvements in disease phenotyping over current computational approaches. In addition, the proposed methodology is extended to discover novel disease-gene associations by including domain knowledge from genome-wide association studies. The approach is evaluated against a held-out set, where it again yields compelling results. For selected diseases, we further identify candidate gene lists for which disease-gene associations were not studied previously. Our approach thus provides biomedical researchers with new tools to filter genes of interest, reducing the need for costly lab studies.
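To make the notion of comorbidity (co-occurrence of diseases within the same patient case) concrete, the following is a minimal sketch in Python. It is not the embedding model described in the abstract, whose details are not given here. It only counts how often pairs of diseases appear together and compares diseases by the cosine similarity of their co-occurrence rows. The toy records, disease names, and function names are all hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations
import math

# Hypothetical toy data: each inner list holds the diagnoses of one inpatient case.
records = [
    ["diabetes", "hypertension", "obesity"],
    ["diabetes", "hypertension"],
    ["asthma", "allergic_rhinitis"],
    ["hypertension", "obesity"],
    ["asthma", "allergic_rhinitis", "obesity"],
]

def cooccurrence_counts(cases):
    """Count how often each unordered pair of diseases appears in the same case."""
    counts = Counter()
    for case in cases:
        # sorted(set(...)) removes repeats and gives each pair a canonical order.
        for a, b in combinations(sorted(set(case)), 2):
            counts[(a, b)] += 1
    return counts

def disease_vectors(cases):
    """Represent each disease by its co-occurrence counts with every disease in the vocabulary."""
    vocab = sorted({d for case in cases for d in case})
    counts = cooccurrence_counts(cases)
    vectors = {}
    for d in vocab:
        vectors[d] = [
            0 if other == d else counts.get(tuple(sorted((d, other))), 0)
            for other in vocab
        ]
    return vectors

def cosine(u, v):
    """Cosine similarity of two equal-length count vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

counts = cooccurrence_counts(records)
vectors = disease_vectors(records)
similarity = cosine(vectors["diabetes"], vectors["asthma"])
```

Raw co-occurrence counts like these are only a baseline representation. A learned embedding, as the abstract describes, would instead be trained on the full database, and any resemblance between this sketch and the paper's model is an assumption.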

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fb8/5006166/2c1c7d486386/srep32404-f1.jpg
