Strayer Nick, Vessels Tess, Choi Karmel, Zhang Siwei, Li Yajing, Han Lide, Sharber Brian, Hsi Ryan S, Bejan Cosmin A, Bick Alexander G, Balko Justin M, Johnson Douglas B, Wheless Lee E, Wells Quinn S, Philips Elizabeth J, Pulley Jill M, Self Wesley H, Chen Qingxia, Hartert Tina, Wilkins Consuelo H, Savona Michael R, Shyr Yu, Roden Dan M, Smoller Jordan W, Ruderfer Douglas M, Xu Yaomin
Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA.
Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA.
medRxiv. 2024 May 27:2024.03.28.24305045. doi: 10.1101/2024.03.28.24305045.
Electronic health records (EHRs) are increasingly used for studying multimorbidities. However, concerns about their accuracy and completeness, together with the fact that EHRs are designed primarily for billing and administrative purposes, raise questions about the consistency and reproducibility of EHR-based multimorbidity research.
Utilizing phecodes to represent the disease phenome, we analyzed pairwise comorbidity strengths using a dual logistic regression approach and constructed multimorbidity as an undirected weighted graph. We assessed the consistency of the multimorbidity networks within and between two major EHR systems at local (nodes and edges), meso (neighboring patterns), and global (network statistics) scales. We present case studies to identify disease clusters and uncover clinically interpretable disease relationships. We provide an interactive web tool and a knowledge base combining data from multiple sources for online multimorbidity analysis.
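As a rough illustration of how pairwise comorbidity strengths can be assembled into an undirected weighted graph, the sketch below substitutes a simpler measure for the dual logistic regression described above: an unadjusted log odds ratio with a Haldane-Anscombe continuity correction. The function name `comorbidity_graph`, the toy patient records, and the placeholder disease labels "A", "B", and "C" are all hypothetical and do not come from the study.

```python
import math
from itertools import combinations


def comorbidity_graph(patients):
    """Build an undirected weighted graph of pairwise comorbidity strengths.

    patients: dict mapping a patient id to the set of disease codes recorded.
    Returns a dict mapping a sorted code pair (a, b) to a log odds ratio.
    NOTE: the log odds ratio is a stand-in for the paper's dual logistic
    regression, whose details are not given in the abstract.
    """
    n = len(patients)
    codes = sorted(set().union(*patients.values()))
    # For each code, the set of patients who carry it.
    carriers = {c: {p for p, s in patients.items() if c in s} for c in codes}

    edges = {}
    for a, b in combinations(codes, 2):
        both = len(carriers[a] & carriers[b])
        only_a = len(carriers[a]) - both
        only_b = len(carriers[b]) - both
        neither = n - both - only_a - only_b
        # Haldane-Anscombe correction: add 0.5 to every cell so that
        # empty cells do not produce a division by zero or log(0).
        odds_ratio = ((both + 0.5) * (neither + 0.5)) / (
            (only_a + 0.5) * (only_b + 0.5)
        )
        edges[(a, b)] = math.log(odds_ratio)
    return edges


# Toy data (hypothetical): "A" and "B" always co-occur, "C" never joins them.
patients = {
    1: {"A", "B"},
    2: {"A", "B"},
    3: {"C"},
    4: {"C"},
    5: {"A", "B"},
    6: set(),
}
edges = comorbidity_graph(patients)
for pair, weight in sorted(edges.items()):
    print(pair, round(weight, 3))
```

Each key is an unordered disease pair and each value is an edge weight, so positive weights mark diseases that co-occur more often than expected and negative weights mark pairs that co-occur less often. A regression-based measure would additionally allow adjustment for covariates, which this stand-in does not attempt.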
Analyzing data from 500,000 patients across the Vanderbilt University Medical Center and Mass General Brigham health systems, we observed strong between-system correlations in disease frequencies (Kendall's τ = 0.643) and comorbidity strengths (Pearson ρ = 0.79). Consistent network statistics across EHRs suggest that the multimorbidity networks have similar structures at various scales. Comorbidity strengths and similarities of multimorbidity connection patterns align with disease genetic correlations. Graph-theoretic analyses revealed a consistent core-periphery structure, suggesting that the networks can be clustered efficiently through threshold graph construction. Using hydronephrosis as a case study, we demonstrated the network's ability to uncover clinically relevant disease relationships and provide novel insights.
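The two consistency statistics reported above can be sketched in a few lines. The example below computes a Pearson correlation and a Kendall rank correlation between per-disease values from two sites. It is only a minimal sketch: the Kendall variant shown is tau-a, which ignores ties and may differ from the variant used in the study, and the names `pearson`, `kendall_tau_a`, `site_a`, and `site_b` as well as the numbers are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical disease frequencies for the same four diseases at two sites.
site_a = [0.10, 0.05, 0.20, 0.01]
site_b = [0.12, 0.04, 0.18, 0.05]

r = pearson(site_a, site_b)
tau = kendall_tau_a(site_a, site_b)
print(f"Pearson r = {r:.3f}, Kendall tau-a = {tau:.3f}")
```

Kendall's statistic depends only on the ordering of diseases by frequency, so it is less sensitive to a few very common diseases than the Pearson coefficient, which may be one reason to report a rank-based measure for frequencies.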
Our findings demonstrate the robustness of large-scale EHR data for studying phenome-wide multimorbidities. The alignment of multimorbidity patterns with genetic data suggests the potential utility for uncovering shared biology of diseases. The consistent core-periphery structure offers analytical insights to discover complex disease interactions. This work also sets the stage for advanced disease modeling, with implications for precision medicine.
This work was funded by the VUMC Biostatistics Development Award, the National Institutes of Health, and the VA CSRD.