IBM Research, Beijing, China.
J Biomed Inform. 2021 Mar;115:103686. doi: 10.1016/j.jbi.2021.103686. Epub 2021 Jan 23.
As electronic health record (EHR) data have accumulated rapidly in recent years, the large volume of patient clinical data has provided opportunities to discover real-world evidence. In this study, a graphical disease network, named the progressive cardiovascular disease network (progCDN), was built to delineate the progression profiles of cardiovascular diseases (CVD).
The EHR data of 14.3 million patients with CVD diagnoses were collected to build the disease network and for further analysis. We applied a newly designed method, progression rates (PR), to calculate the progression relationships among different diagnoses. Based on the resulting disease network, 23 disease progression pairs were selected to screen for salient features.
The network depicted the dominant diseases in CVD development, such as heart failure and coronary arteriosclerosis. Novel progression relationships were also discovered, such as the progression path from long QT syndrome to major depression. In addition, three age-group progCDNs identified a series of age-associated disease progression paths and important successor diseases with age bias. Furthermore, a list of important features with sufficient abundance and high correlation was extracted for building disease risk models.
The PR method, designed to identify progression relationships, could be widely applied to any EHR database owing to its flexibility and robustness. Meanwhile, researchers could use progCDN to validate or explore novel disease relationships in real-world data.
This first interrogation of such a large CVD patient cohort enabled us to explore the general and age-specific disease progression patterns in CVD development.