Comparing machine learning algorithms to predict 5-year survival in patients with chronic myeloid leukemia.

Author information

Department of Health Information Technology, Faculty of Paramedical, Ilam University of Medical Sciences, Ilam, Iran.

Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran.

Publication information

BMC Med Inform Decis Mak. 2022 Sep 6;22(1):236. doi: 10.1186/s12911-022-01980-w.

Abstract

INTRODUCTION

Chronic myeloid leukemia (CML) is a myeloproliferative disorder resulting from the translocation of chromosomes 9 and 22. CML accounts for 15-20% of all cases of leukemia. Although bone marrow transplant and, more recently, tyrosine kinase inhibitors (TKIs) as a first-line treatment have significantly prolonged survival in CML patients, accurately predicting survival from available patient-level factors can be challenging. We aimed to predict 5-year survival among CML patients with eight machine learning (ML) algorithms and to compare their performance.

METHODS

Data from 837 CML patients were retrospectively extracted and randomly split into training and test sets (70:30 ratio). The outcome variable was 5-year survival, with possible values of alive or deceased. Two versions of the dataset, one with the full features and one with the important features selected by minimal redundancy maximal relevance (mRMR) feature selection, were fed into eight ML techniques: eXtreme gradient boosting (XGBoost), multilayer perceptron (MLP), pattern recognition network, k-nearest neighbors (KNN), probabilistic neural network, support vector machine (SVM) (kernel = linear), SVM (kernel = RBF), and the J-48 decision tree. The scikit-learn library in Python was used to implement the models. Finally, the performance of the developed models was measured using several evaluation criteria, reported with 95% confidence intervals (CI).
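The split, feature selection, and cross-validation steps described above can be sketched in scikit-learn as follows. This is a minimal illustration, not the authors' code: the data are synthetic (`make_classification` stands in for the 837 patient records), the helper `mrmr_select` is a hypothetical simplified mRMR that uses mutual information for relevance and mean absolute Pearson correlation as a stand-in for redundancy, and the choices of eight selected features, stratified splitting, feature scaling, and default SVM hyperparameters are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def mrmr_select(X, y, k, random_state=0):
    """Greedy mRMR-style selection (hypothetical, simplified helper).

    Relevance: mutual information between each feature and the label.
    Redundancy: mean absolute Pearson correlation with features already chosen.
    """
    relevance = mutual_info_classif(X, y, random_state=random_state)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = [int(np.argmax(relevance))]  # start with the most relevant feature
    while len(selected) < k:
        remaining = [j for j in range(X.shape[1]) if j not in selected]
        # Score = relevance minus average redundancy with the selected set.
        scores = [relevance[j] - corr[j, selected].mean() for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected


# Synthetic stand-in for the patient dataset (assumption: 20 numeric features).
X, y = make_classification(
    n_samples=837, n_features=20, n_informative=6, random_state=0
)

# 70:30 random split; stratification is an assumption, not stated in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Select features on the training portion only, to avoid leaking test data.
selected = mrmr_select(X_train, y_train, k=8)

# RBF-kernel SVM with feature scaling, evaluated by tenfold cross-validation.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
cv_scores = cross_val_score(
    model, X_train[:, selected], y_train, cv=10, scoring="accuracy"
)

# Refit on the full training set and score once on the held-out 30%.
model.fit(X_train[:, selected], y_train)
test_accuracy = model.score(X_test[:, selected], y_test)

print(f"selected feature indices: {selected}")
print(f"10-fold CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
print(f"held-out test accuracy: {test_accuracy:.3f}")
```

For a stricter estimate, the selection step would be repeated inside each cross-validation fold rather than once on the whole training set.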

RESULTS

Palpable spleen, age, and unexplained hemorrhage were identified as the three most influential features for 5-year survival in CML. ML models trained on the selected features outperformed those trained on the full-feature dataset. Among the eight ML algorithms, SVM (kernel = RBF) performed best in tenfold cross-validation on the selected features, with an accuracy of 85.7%, specificity of 85%, sensitivity of 86%, F-measure of 87%, kappa statistic of 86.1%, and area under the curve (AUC) of 85%. Using the full-feature dataset yielded an accuracy of 69.7%, specificity of 69.1%, sensitivity of 71.3%, F-measure of 72%, kappa statistic of 75.2%, and AUC of 70.1%.
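As a reminder of how the six reported criteria relate to a model's predictions, the sketch below computes them with scikit-learn on a ten-sample toy example. The labels and scores are invented for illustration and have no connection to the study data; the helper name `summarize`, the coding of 1 as the positive class, and the 0.5 decision threshold are assumptions.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    cohen_kappa_score,
    confusion_matrix,
    f1_score,
    roc_auc_score,
)


def summarize(y_true, y_pred, y_score):
    """Return the six evaluation criteria for a binary classifier.

    y_pred holds hard class labels; y_score holds continuous scores for AUC.
    """
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "f_measure": f1_score(y_true, y_pred),
        "kappa": cohen_kappa_score(y_true, y_pred),
        "auc": roc_auc_score(y_true, y_score),
    }


# Toy data (invented): 1 is treated as the positive class.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 1, 0])
y_score = np.array([0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1, 0.85, 0.35])
y_pred = (y_score >= 0.5).astype(int)  # assumed decision threshold of 0.5

metrics = summarize(y_true, y_pred, y_score)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Accuracy, sensitivity, specificity, the F-measure, and kappa all depend on the chosen threshold, whereas AUC is computed from the ranking of the continuous scores.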

CONCLUSIONS

Accurate prediction of survival likelihood in CML patients can help caregivers refine prognosis and choose the best possible treatment path. Although external validation is required, the developed models may support customized treatment and guide the prescription of personalized medicine for CML patients.

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9313/9450320/ceb9dc2373c3/12911_2022_1980_Fig1_HTML.jpg
