Wang Yuanjia, Wu Peng, Liu Ying, Weng Chunhua, Zeng Donglin
Department of Biostatistics, Columbia University.
Department of Biomedical Informatics, Columbia University.
Proc (IEEE Int Conf Healthc Inform). 2016 Oct;2016:65-71. doi: 10.1109/ICHI.2016.13. Epub 2016 Dec 8.
Medical research is undergoing a paradigm shift from a "one-size-fits-all" strategy to a precision medicine approach, in which the right therapy is prescribed for the right patient at the right time. We propose a statistical method for estimating optimal individualized treatment rules (ITRs), tailored to subject-specific features, from electronic health records (EHR) data. Our approach combines statistical modeling and medical domain knowledge with machine learning algorithms to support personalized medical decision making from EHR data. We transform the estimation of the optimal ITR into a classification problem, accounting for the non-experimental nature of EHR data and for confounding by clinical indication. We create a broad range of feature variables that reflect both patient health status and the healthcare data collection process. Using EHR data from the Columbia University clinical data warehouse, we construct a decision tree for choosing the best second-line therapy for patients with type 2 diabetes.
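To make the idea of recasting ITR estimation as classification concrete, here is a minimal sketch of one common formulation, outcome-weighted classification with inverse-propensity weights. It is not taken from the paper and may differ from the authors' estimator. Everything in it is an assumption made for illustration: the toy cohort, the severity score `x`, the discrete confounder `stratum`, the reward values, and the depth-one tree (a stump) standing in for a full decision tree.

```python
from collections import defaultdict


def make_toy_cohort():
    """Build a small deterministic cohort (illustrative only, not real EHR data)."""
    rows = []
    for x in range(1, 9):                      # x: hypothetical baseline severity score
        stratum = "low" if x <= 4 else "high"  # discrete confounder driving prescribing
        n_plus = 1 if stratum == "low" else 3  # "high" patients receive arm +1 more often
        arms = [+1] * n_plus + [-1] * (4 - n_plus)
        best_arm = -1 if x <= 4 else +1        # assumed ground-truth optimal arm
        for a in arms:
            reward = 2.0 if a == best_arm else 1.0  # positive outcome, larger is better
            rows.append({"x": x, "stratum": stratum, "a": a, "reward": reward})
    return rows


def estimate_propensity(rows):
    """Return prop(stratum, a): the empirical P(A = a | stratum)."""
    counts = defaultdict(lambda: defaultdict(int))
    for r in rows:
        counts[r["stratum"]][r["a"]] += 1

    def prop(stratum, a):
        return counts[stratum][a] / sum(counts[stratum].values())

    return prop


def fit_stump(rows, prop):
    """Fit a depth-one tree by minimizing weighted misclassification of the
    observed treatment, with weight = reward / propensity for each patient."""
    weights = [r["reward"] / prop(r["stratum"], r["a"]) for r in rows]
    best = None
    for t in sorted({r["x"] for r in rows}):
        for left in (-1, +1):  # arm recommended when x <= t; the other arm otherwise
            loss = 0.0
            for r, w in zip(rows, weights):
                recommended = left if r["x"] <= t else -left
                if recommended != r["a"]:
                    loss += w
            if best is None or loss < best[0]:
                best = (loss, t, left)
    return best[1], best[2]


rows = make_toy_cohort()
prop = estimate_propensity(rows)
threshold, left_arm = fit_stump(rows, prop)


def rule(x):
    """Estimated treatment rule: recommended arm for a patient with score x."""
    return left_arm if x <= threshold else -left_arm


print(threshold, left_arm)
```

In this toy cohort the prescribing pattern depends on the stratum, so unweighted comparisons of outcomes would be confounded. Dividing each reward by the estimated propensity reweights patients as if treatment had been assigned independently of the stratum, and the classifier then favors the arm whose recipients did well.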