IEEE J Biomed Health Inform. 2019 Jan;23(1):395-406. doi: 10.1109/JBHI.2018.2812165. Epub 2018 Mar 5.
An evolutionary ensemble modeling (EEM) method is developed to improve the accuracy of warfarin dose prediction. In EEM, genetic programming (GP) evolves diverse base models, and a genetic algorithm optimizes the parameters of the GP. The EEM model is then assembled from the prepared base models through bagging (bootstrap aggregating). In the experiment, a dataset of 289 Chinese patients, provided by the First Affiliated Hospital of Soochow University, is used for training, validation, and testing. The EEM model with selected feature groups is benchmarked against four machine-learning methods and three conventional regression models. Results show that the EEM model with the M2+G feature group (age, height, weight, gender, CYP2C9, VKORC1, and amiodarone) achieves the largest coefficient of determination (R²), the highest percentage of predicted doses within 20% of the actual dose (20%-p), and the smallest mean absolute error, mean squared error, and root-mean-squared error on the test set, as well as the smallest decrease in R² from the training set to the test set. In conclusion, the EEM method with M2+G delivers superior performance and can therefore be a suitable warfarin dose prediction model for clinical applications.
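The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `dose_metrics`, the dictionary keys, and the sample doses are hypothetical and not taken from the study. The sketch assumes the coefficient of determination is computed as one minus the ratio of residual to total sum of squares, and that 20%-p counts predictions whose absolute error is at most 20% of the actual dose; the paper may define either quantity slightly differently.

```python
import math


def dose_metrics(actual, predicted):
    """Return R^2, 20%-p, MAE, MSE, and RMSE for paired dose lists."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equal in length")

    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root-mean-squared error

    # Assumed definition: R^2 = 1 - SS_res / SS_tot.
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    # Assumed definition: share of predictions within 20% of the actual dose.
    within_20 = sum(1 for a, e in zip(actual, errors) if abs(e) <= 0.2 * abs(a))
    p20 = 100.0 * within_20 / n

    return {"r2": r2, "p20": p20, "mae": mae, "mse": mse, "rmse": rmse}


if __name__ == "__main__":
    # Made-up doses for illustration only; not data from the study.
    actual_doses = [2.0, 3.0, 4.0, 5.0]
    predicted_doses = [2.2, 3.9, 4.0, 4.5]
    for name, value in dose_metrics(actual_doses, predicted_doses).items():
        print(f"{name}: {value:.3f}")
```

Computing the same metrics on both the training set and the test set gives the train-to-test decrease in R² that the abstract uses as an indicator of overfitting.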