Dahlem Dominik, Maniloff Diego, Ratti Carlo
IBM Research-Ireland, Dublin 15, Ireland.
Senseable City Lab, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
Sci Rep. 2015 Jul 7;5:11865. doi: 10.1038/srep11865.
The ability to intervene in disease progression given a person's disease history has the potential to address one of society's most pressing issues: advancing health care delivery and reducing its cost. Controlling disease progression is inherently tied to the ability to predict possible future diseases from a patient's medical history. We use an information-theoretic methodology to quantify the level of predictability inherent in the disease histories of a large electronic health records dataset with over half a million patients. In our analysis, we progress from zeroth-order through temporally informed statistics, both from an individual patient's standpoint and considering collective effects. Our findings confirm our intuition that knowledge of common disease progressions results in higher predictability bounds than treating disease histories independently. We complement this result by showing the point at which the temporal dependence structure vanishes with increasing orders of the time-correlated statistic. Surprisingly, we also show that shuffling individual disease histories only marginally degrades the predictability bounds. This apparent contradiction with respect to the importance of time-ordered information is indicative of the complexities involved in capturing the health care process and the difficulties associated with utilising this information in universal prediction algorithms.
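As a rough illustration of what an information-theoretic predictability bound can look like, the sketch below computes the order-free Shannon entropy of a single disease history and converts it into an upper bound on prediction accuracy. The abstract does not say which bound the authors use; this sketch assumes Fano's inequality, a common choice in predictability studies. The function names and the toy diagnosis codes are hypothetical and serve only as an example.

```python
import math
from collections import Counter


def shannon_entropy(history):
    """Order-free Shannon entropy (in bits) of the codes in one history."""
    counts = Counter(history)
    total = len(history)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def max_predictability(entropy, n_symbols, tol=1e-10):
    """Assumed Fano-type bound: solve
    entropy = H_b(p) + (1 - p) * log2(n_symbols - 1)
    for p on [1/n_symbols, 1] by bisection."""
    if n_symbols < 2:
        return 1.0  # a single possible code is trivially predictable

    def fano(p):
        return binary_entropy(p) + (1 - p) * math.log2(n_symbols - 1)

    # fano() decreases from log2(n_symbols) at p = 1/n_symbols to 0 at p = 1.
    entropy = min(max(entropy, 0.0), math.log2(n_symbols))
    lo, hi = 1.0 / n_symbols, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fano(mid) > entropy:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical toy history of diagnosis codes for one patient.
history = ["I10", "E11", "I10", "E11", "I10", "J45", "I10", "E11"]
s = shannon_entropy(history)
bound = max_predictability(s, n_symbols=len(set(history)))
```

Because this entropy ignores ordering, shuffling `history` leaves `bound` unchanged. A temporally informed estimate would replace `shannon_entropy` with an entropy-rate estimator that is sensitive to the order of diagnoses.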