Jiangsu Province Blood Center, Nanjing, Jiangsu, People's Republic of China.
Yangzhou Blood Station, Yangzhou, Jiangsu, People's Republic of China.
Sci Rep. 2022 Nov 10;12(1):19165. doi: 10.1038/s41598-022-21215-2.
Machine learning methods offer a novel way to predict and rank donors' willingness to donate blood and to achieve precision recruitment, which can improve recruitment efficiency and help meet the challenge of blood shortages. We collected information on experienced blood donors contacted through short message service (SMS) recruitment and, using 13 features in a PyCharm-Python environment, developed seven machine learning-based recruitment models that rank and predict each donor's intention to donate as a floating-point score between 0 and 1. Model performance was assessed by the area under the receiver operating characteristic curve (AUC), accuracy, precision, recall, and F1 score in the full dataset, and by accuracy in the four sub-datasets. The developed models were then applied in prospective validations, recruiting experienced blood donors during two COVID-19 pandemic periods, with the routine method used as a control. In total, 95,476 SMS recruitments and their donation outcomes were included in the modelling study. The strongest predictors of donation by experienced donors were blood donation interval, age, and donation frequency. Among the seven baseline models, the eXtreme Gradient Boosting (XGBoost) and support vector machine (SVM) models performed best. Reported as mean (95% CI), XGBoost achieved the highest AUC, 0.809 (0.806-0.811); accuracy, 0.815 (0.812-0.818); precision, 0.840 (0.835-0.845); and F1 score, 0.843 (0.840-0.845), while SVM achieved the highest recall, 0.991 (0.988-0.994). In the two prospective recruitments, the hit rates of the XGBoost model alone and of the combined XGBoost and SVM models were 1.25 and 1.80 times higher, respectively, than that of the conventional method used as a control, and the hit rate of the high-willingness group was 1.96 times higher than that of the low-willingness group.
Our results suggest that machine learning models can identify experienced donors with a strong willingness to donate blood by means of a ranking score based on personalized donation data and demographic details, significantly improve the recruitment rate of blood donors, and help blood agencies maintain the blood supply in emergencies.
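The evaluation quantities named in the abstract (AUC, accuracy, precision, recall, F1 score, and hit rate) can be illustrated with a minimal pure-Python sketch. This is not the study's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the "hit rate" here is taken to mean the fraction of contacted donors who actually donated.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_score: Sequence[float],
                           threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 after thresholding the scores.

    The 0.5 threshold is an assumption, not a value taken from the study.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random donor who donated is scored
    above a random donor who did not (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_k_hit_rate(y_true: Sequence[int], y_score: Sequence[float],
                   k: int) -> float:
    """Fraction of actual donors among the k highest-scored donors."""
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0],
                    reverse=True)
    top = [label for _, label in ranked[:k]]
    return sum(top) / len(top)


if __name__ == "__main__":
    # Toy data (invented): 1 = donated after the SMS, 0 = did not donate.
    labels = [1, 0, 1, 1, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    print(classification_metrics(labels, scores))
    print("AUC:", roc_auc(labels, scores))
    baseline = sum(labels) / len(labels)  # hit rate when everyone is contacted
    targeted = top_k_hit_rate(labels, scores, k=4)
    print("hit-rate ratio:", targeted / baseline)
```

Comparing the hit rate of the top-ranked donors with the hit rate of an unranked control group mirrors, in spirit, the ratios reported above; the exact grouping and control procedure used in the study are not specified in this abstract.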