Explainable machine learning for chronic lymphocytic leukemia treatment prediction using only inexpensive tests.

Affiliations

Department of Software and Information Systems Engineering, Ben Gurion University of the Negev, P.O.B. 653, Be'er Sheva, 8410501, Israel.

Internal Medicine C, Bnai Zion Medical Center, Haifa, Israel.

Publication information

Comput Biol Med. 2022 Jun;145:105490. doi: 10.1016/j.compbiomed.2022.105490. Epub 2022 Apr 6.

DOI: 10.1016/j.compbiomed.2022.105490
PMID: 35405402

Abstract

BACKGROUND

Chronic lymphocytic leukemia (CLL) is one of the most common types of leukemia in the Western world and mainly affects the elderly. The course of the disease is highly heterogeneous, both in the need for treatment and in life expectancy. The current prognostic scoring system for patients with CLL, CLL-IPI, predicts the general course of the disease but is neither a measure of nor a decision aid for the need for treatment. Because CLL behaves so heterogeneously, it is important to develop tools that identify whether and when a patient will need treatment. Recently, machine learning (ML) has spread to many public health fields, including the diagnosis and prognosis of diseases.

OBJECTIVE

Existing machine learning methods for CLL treatment prediction rely on expensive tests, such as genetic tests, rendering them unusable in peripheral or low-resource clinics such as those in developing countries. We aim to develop a machine learning model, based only on demographic data and routine laboratory tests, that predicts whether a patient will need treatment for CLL within two years of diagnosis.

METHOD

We conducted a single-center study of adult patients (over 18 years of age) who were diagnosed with CLL according to the IWCLL criteria and were under observation at the hematology unit of the Bnai-Zion Medical Center between 2009 and 2019. Patient data included demographic, clinical, and laboratory measures extracted anonymously from the patients' medical records. All laboratory results from the observation period were extracted for the entire cohort. We evaluated multiple ML approaches for classifying whether a patient would require treatment within a predetermined period of 2 years. Model performance was measured using repeated cross-validation. We evaluated SHapley Additive exPlanations (SHAP) for explaining what influences the model's decisions. Additionally, we employed a method for extracting a single decision tree from the ML model, which enables the doctor to understand the main logic governing the model's predictions.
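The evaluation protocol described above (repeated cross-validation scored by area under the precision-recall curve) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' code: the abstract does not state the number of folds or repeats, whether the folds were stratified, or which AUPRC estimator was used, so `k`, `repeats`, the stratification, and the use of average precision are all assumptions, and both function names are hypothetical.

```python
import random
from collections import defaultdict


def average_precision(y_true, scores):
    """Average precision, one common estimator of AUPRC.

    It is the mean of the precision values observed at each positive
    example when examples are ranked by descending score.
    Assumption: scores are not tied (ties would need threshold grouping).
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this positive
    return total / n_pos


def repeated_stratified_folds(labels, k=5, repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for `repeats` rounds of k-fold CV.

    Each class is shuffled and dealt round-robin into the k folds, so
    every fold keeps roughly the overall class balance.
    The defaults for k and repeats are placeholders, not values from the study.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for _ in range(repeats):
        folds = [[] for _ in range(k)]
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            for pos, idx in enumerate(shuffled):
                folds[pos % k].append(idx)
        for i in range(k):
            test_idx = sorted(folds[i])
            train_idx = sorted(
                j for f, fold in enumerate(folds) if f != i for j in fold
            )
            yield train_idx, test_idx


if __name__ == "__main__":
    # Toy ranking: positives at ranks 1 and 3 give (1/1 + 2/3) / 2.
    print(round(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]), 4))
    toy_labels = [0, 1] * 10
    print(sum(1 for _ in repeated_stratified_folds(toy_labels, k=5, repeats=3)))
```

In a full pipeline, a model would be fitted on each `train_idx`, scored on the matching `test_idx` with `average_precision`, and the per-split scores summarized as a mean and spread, which is the form in which the results are reported.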

RESULTS

The study included 109 patients, of whom 67 (61%) were male. Patients were under observation for a median of 44 months, and the median age was 65 years (range: 45-87). During follow-up, 64% of the cohort received therapy. A gradient boosting model (GBM) that used all of the extracted variables to identify the need for treatment in the coming two years among patients with CLL achieved an area under the precision-recall curve (AUPRC) of 0.78 (±0.08). An otherwise identical GBM without genetic/FISH and flow cytometry (FACS) data, so that it can be used in peripheral clinics, scored an AUPRC of 0.7686 (±0.0837). A generalized linear model (GLM) using the same features scored an AUPRC of 0.7535 (±0.0995). All of these models surpassed the performance of CLL-IPI, which was evaluated using the CLL-TIM model. According to the SHAP results, red blood cell (RBC) count was the most predictive feature for the need for treatment: a high value is associated with a low probability of requiring treatment in the coming two years. The SHAP method was also used to estimate the personal risk of a randomly chosen patient and gave sensible results. A simple decision tree classifier showed that patients with a hemoglobin level below 13 g/dL and a neutrophil-to-lymphocyte ratio (NLR) below 0.063, who constituted 34% of the patients in our study, had a high probability (76%) of requiring treatment.
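The decision-tree leaf reported above can be written out as a short rule. The sketch below encodes only the two thresholds given in the abstract (hemoglobin below 13 g/dL, NLR below 0.063); the function names are hypothetical, the rest of the tree is not described in the abstract and is not reproduced, and the code is an illustration of the logic rather than a clinical tool.

```python
HGB_CUTOFF_G_DL = 13.0  # hemoglobin threshold reported in the abstract
NLR_CUTOFF = 0.063      # neutrophil-to-lymphocyte ratio threshold from the abstract


def neutrophil_lymphocyte_ratio(neutrophils, lymphocytes):
    """NLR = neutrophil count / lymphocyte count (both in the same units)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils / lymphocytes


def in_high_risk_leaf(hemoglobin_g_dl, neutrophils, lymphocytes):
    """True when a patient falls in the leaf that the abstract associates
    with a 76% probability of requiring treatment within two years."""
    nlr = neutrophil_lymphocyte_ratio(neutrophils, lymphocytes)
    return hemoglobin_g_dl < HGB_CUTOFF_G_DL and nlr < NLR_CUTOFF


if __name__ == "__main__":
    # Made-up values for illustration only (counts in 10^9 cells/L).
    print(in_high_risk_leaf(11.5, neutrophils=3.0, lymphocytes=60.0))  # NLR = 0.05
    print(in_high_risk_leaf(14.0, neutrophils=3.0, lymphocytes=60.0))  # hemoglobin too high
```

Both conditions must hold for a patient to fall in this leaf; a patient meeting only one of them is routed elsewhere in the tree, and the abstract does not report the outcome rates for those branches.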

CONCLUSIONS

The machine learning algorithms evaluated in this work for predicting the need for treatment in patients with CLL achieved reasonable accuracy, surpassing that of CLL-IPI, which was evaluated using the CLL-TIM model. Furthermore, a machine learning model trained exclusively on inexpensive features incurred only a modest decrease in performance compared with the model trained on all of the features. Because of the small number of patients in this study, the results need to be validated on a larger population.
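To put a number on the "modest decrease", the gap between the two GBM results reported in the abstract can be worked out directly. This is a small arithmetic sketch with a hypothetical helper name; the full-feature AUPRC is only reported to two decimals (0.78), so the figures are approximate, and the abstract does not say what the ± values represent, so they are treated here simply as a reported spread.

```python
def performance_gap(full, reduced):
    """Return the absolute and relative drop from `full` to `reduced`."""
    absolute = full - reduced
    relative = absolute / full
    return absolute, relative


# Values as reported in the abstract.
full_auprc, full_spread = 0.78, 0.08          # GBM, all extracted variables
cheap_auprc, cheap_spread = 0.7686, 0.0837    # GBM, without genetic/FISH and FACS data

absolute, relative = performance_gap(full_auprc, cheap_auprc)
within_spread = absolute < min(full_spread, cheap_spread)

print(f"absolute drop: {absolute:.4f}")   # prints "absolute drop: 0.0114"
print(f"relative drop: {relative:.1%}")   # prints "relative drop: 1.5%"
print(f"smaller than reported spread: {within_spread}")
```

The drop of about 0.011 AUPRC is several times smaller than the roughly ±0.08 spread reported for either model.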
