Alexander Joe, Edwards Roger A, Manca Luigi, Grugni Roberto, Bonfanti Gianluca, Emir Birol, Whalen Ed, Watt Steve, Brodsky Marina, Parsons Bruce
Global Medical Affairs, Pfizer Inc, New York, NY 10017, USA.
Health Services Consulting Corporation, Boxborough, MA 01719, USA.
Pragmat Obs Res. 2019 Oct 31;10:67-76. doi: 10.2147/POR.S214412. eCollection 2019.
Variability in patient treatment responses can be a barrier to effective care. Utilization of available patient databases may improve the prediction of treatment responses. We evaluated machine learning methods to predict novel, individual patient responses to pregabalin for painful diabetic peripheral neuropathy, utilizing an agent-based modeling and simulation platform that integrates real-world observational study (OS) data and randomized clinical trial (RCT) data.
The best supervised machine learning methods were selected (through literature review) and combined in a novel way for aligning patients with relevant subgroups that best enable prediction of pregabalin responses. Data were derived from a German OS of pregabalin (N=2642) and nine international RCTs (N=1320). Coarsened exact matching of OS and RCT patients was used and a hierarchical cluster analysis was implemented. We tested which machine learning methods would best align candidate patients with specific clusters that predict their pain scores over time. Each cluster alignment triggers the assignment of a cluster-specific time-series regression, with lagged variables as inputs, which is used to simulate "virtual" patients and generate 1000 trajectory variations for a given novel patient.
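The simulation step can be illustrated with a minimal sketch. It assumes a lag-1 regression on the previous pain score with Gaussian noise and a 0-10 pain scale; the function names, coefficients, noise level, and number of time steps are hypothetical and are not taken from the study, whose regressions may use more lags and covariates.

```python
import random


def simulate_trajectories(baseline_pain, intercept, lag_coef, noise_sd,
                          n_steps=6, n_sims=1000, seed=0):
    """Simulate pain-score trajectories for one novel patient.

    Assumed model (illustrative only):
        pain[t] = intercept + lag_coef * pain[t-1] + Gaussian noise
    Scores are clipped to an assumed 0-10 pain scale.
    """
    rng = random.Random(seed)
    trajectories = []
    for _ in range(n_sims):
        pain = [float(baseline_pain)]
        for _ in range(n_steps):
            nxt = intercept + lag_coef * pain[-1] + rng.gauss(0.0, noise_sd)
            pain.append(min(10.0, max(0.0, nxt)))
        trajectories.append(pain)
    return trajectories


def mean_trajectory(trajectories):
    """Average the simulated trajectories time point by time point."""
    n = len(trajectories)
    length = len(trajectories[0])
    return [sum(t[i] for t in trajectories) / n for i in range(length)]


# Hypothetical cluster-specific coefficients for a patient with baseline pain 6.5.
sims = simulate_trajectories(6.5, intercept=0.4, lag_coef=0.85, noise_sd=0.6)
print(len(sims), [round(v, 2) for v in mean_trajectory(sims)])
```

In this sketch each cluster would carry its own `intercept`, `lag_coef`, and `noise_sd`, so assigning a patient to a cluster selects which regression drives the 1000 simulated trajectories.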
Instance-based machine learning methods (k-nearest neighbor, supervised fuzzy c-means) were selected for quantitative analyses. Used alone, k-nearest neighbor and supervised fuzzy c-means correctly classified 56.7% and 39.1% of patients, respectively. An "ensemble method" (combining both methods) correctly classified 98.4% and 95.9% of patients in the training and testing datasets, respectively.
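One way to picture combining the two instance-based methods is the sketch below. The abstract does not state the combination rule, so the equal-weight averaging of k-nearest-neighbor vote fractions and fuzzy c-means memberships is an assumption made purely for illustration. All function names and the toy data are hypothetical, and only continuous features with Euclidean distance are handled here.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_scores(x, points, labels, k=3):
    """Fraction of the k nearest labelled patients that fall in each cluster."""
    order = sorted(range(len(points)), key=lambda i: euclidean(x, points[i]))
    votes = Counter(labels[i] for i in order[:k])
    return {c: votes.get(c, 0) / k for c in set(labels)}


def fuzzy_scores(x, centroids, m=2.0):
    """Standard fuzzy c-means membership of x in each cluster centroid."""
    dists = {c: euclidean(x, v) for c, v in centroids.items()}
    for c, d in dists.items():
        if d == 0.0:  # x sits exactly on a centroid
            return {key: 1.0 if key == c else 0.0 for key in dists}
    power = 2.0 / (m - 1.0)
    return {c: 1.0 / sum((d / other) ** power for other in dists.values())
            for c, d in dists.items()}


def ensemble_assign(x, points, labels, centroids, k=3, m=2.0):
    """Average both score sets (assumed rule) and pick the best cluster."""
    knn = knn_scores(x, points, labels, k)
    fuzzy = fuzzy_scores(x, centroids, m)
    combined = {c: 0.5 * knn.get(c, 0.0) + 0.5 * fuzzy[c] for c in centroids}
    return max(combined, key=combined.get), combined


# Toy data: two clusters of already-classified patients in a 2-feature space.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["A", "A", "A", "B", "B", "B"]
centroids = {"A": (1 / 3, 1 / 3), "B": (16 / 3, 16 / 3)}
print(ensemble_assign((0.5, 0.5), points, labels, centroids))
```

Handling the dichotomous and categorical variables mentioned in the abstract would require a mixed-type distance measure, which this sketch leaves out.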
An ensemble combination of two instance-based machine learning techniques best accommodated different data types (dichotomous, categorical, continuous) and performed better than either technique alone in assigning novel patients to subgroups for predicting treatment outcomes using microsimulation. Assignment of novel patients to a cluster of similar patients has the potential to improve prediction of patient outcomes for chronic conditions in which initial treatment response can be incorporated using microsimulation.
www.clinicaltrials.gov: NCT00156078, NCT00159679, NCT00143156, NCT00553475.