Department of Evidence-Based Medicine and Clinical Epidemiology, School of Medicine/West China Hospital, Sichuan University, No. 17, Section 3, Renmin South Road, Chengdu, 610041, Sichuan, China.
College of Computer Science, Sichuan University, Chengdu, Sichuan, China.
Sci Rep. 2021 Jul 2;11(1):13778. doi: 10.1038/s41598-021-93317-2.
Patients requiring low-dose warfarin are more likely to suffer bleeding due to overdose. This work aims to improve the precision of a feedforward neural network model in predicting low maintenance doses for Chinese patients by changing how the training data are constructed. We built the model from a resampled dataset created by equal stratified sampling (keeping the same number of samples in each of the three dose groups, 3639 in total) and performed internal and external validations. Compared with the model trained on the raw dataset of 19,060 eligible cases, we improved the low-dose group's ideal prediction percentage from 0.7% to 9.6% and maintained the overall performance (76.4% vs. 75.6%) in external validation. We further built neural network models on single-dose subsets to investigate whether the subset samples were sufficient and whether the selected factors were appropriate. The training set sizes were 1340 and 1478 for the low-dose and high-dose subsets; the corresponding ideal prediction percentages were 70.2% and 75.1%. Three training set sizes were used for the intermediate-dose subset: 1553, 6214, and 12,429; the corresponding ideal prediction percentages were 95.6%, 95.1%, and 95.3%. We conclude that equal stratified sampling can be considered as an alternative approach to training data construction when building drug dosing models for clinical use.
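The resampling step described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the `dose_mg` field, the dose cut-offs, and the toy records are all hypothetical, chosen only to show what drawing the same number of cases from each dose group looks like. With three groups and 3639 samples in total, an exact equal split would be 1213 per group; the toy data below use a smaller number.

```python
import random
from collections import Counter, defaultdict


def dose_group(record, low_cut=1.5, high_cut=4.5):
    """Assign a record to a dose group.

    The cut-offs (mg/day) are hypothetical placeholders; the abstract
    does not state the thresholds used in the study.
    """
    dose = record["dose_mg"]
    if dose < low_cut:
        return "low"
    if dose > high_cut:
        return "high"
    return "intermediate"


def equal_stratified_sample(records, group_of, per_group, seed=0):
    """Draw `per_group` records from every group, without replacement."""
    strata = defaultdict(list)
    for rec in records:
        strata[group_of(rec)].append(rec)

    rng = random.Random(seed)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        if len(members) < per_group:
            raise ValueError(f"group {name!r} has only {len(members)} records")
        sample.extend(rng.sample(members, per_group))

    rng.shuffle(sample)  # avoid presenting the groups in blocks during training
    return sample


# Toy, imbalanced dataset (synthetic, for illustration only):
# most cases fall in the intermediate group, as in a raw clinical cohort.
raw = (
    [{"dose_mg": 1.0} for _ in range(100)]
    + [{"dose_mg": 3.0} for _ in range(800)]
    + [{"dose_mg": 6.0} for _ in range(100)]
)

balanced = equal_stratified_sample(raw, dose_group, per_group=50)
group_counts = Counter(dose_group(rec) for rec in balanced)
```

Sampling without replacement caps the per-group size at the size of the smallest group, which is why a balanced set drawn this way is much smaller than the raw dataset.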