The innovative model based on artificial intelligence algorithms to predict recurrence risk of patients with postoperative breast cancer.

Authors

Zeng Lixuan, Liu Lei, Chen Dongxin, Lu Henghui, Xue Yang, Bi Hongjie, Yang Weiwei

Affiliations

Department of Pathology, Harbin Medical University, Harbin, China.

Department of Breast Surgery, The Third Affiliated Hospital of Harbin Medical University, Harbin, China.

Publication information

Front Oncol. 2023 Mar 7;13:1117420. doi: 10.3389/fonc.2023.1117420. eCollection 2023.

Abstract

PURPOSE

This study aimed to develop a machine learning model to retrospectively study and predict the recurrence risk of breast cancer patients after surgery by extracting the clinicopathological features of tumors from unstructured clinical electronic health record (EHR) data.

METHODS

This retrospective cohort included 1,841 breast cancer patients who underwent surgical treatment. To extract the principal features associated with recurrence risk, the clinical notes and histopathology reports of patients were collected and feature engineering was applied. Predictive models were then constructed from these features. All algorithms were implemented in Python. The accuracy of the prediction models was further verified in the test cohort. The area under the curve (AUC), precision, recall, and F1 score were used to evaluate the performance of each model.
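The evaluation metrics named above can be illustrated with a minimal sketch. The abstract does not say which library or code the authors used, so this is plain Python written for illustration only; the function names and the ten toy labels are hypothetical and are not data from the study.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1, tp + fn)  # support = true count
    return scores


def macro_f1(scores):
    """Unweighted mean of per-class F1: every class counts equally."""
    return sum(s[2] for s in scores.values()) / len(scores)


def weighted_f1(scores):
    """Mean of per-class F1 weighted by class support."""
    total = sum(s[3] for s in scores.values())
    return sum(s[2] * s[3] for s in scores.values()) / total


# Hypothetical toy labels for three risk groups (NOT study data).
y_true = ["low", "low", "intermediate", "intermediate", "intermediate",
          "high", "high", "high", "high", "high"]
y_pred = ["low", "intermediate", "intermediate", "intermediate", "high",
          "high", "high", "high", "high", "intermediate"]

scores = per_class_scores(y_true, y_pred)
for label, (p, r, f1, n) in scores.items():
    print(f"{label:>12}: precision={p:.2f} recall={r:.2f} f1={f1:.2f} n={n}")
print(f"macro-avg F1    = {macro_f1(scores):.3f}")
print(f"weighted-avg F1 = {weighted_f1(scores):.3f}")
```

The macro average treats each risk group equally, whereas the weighted average is dominated by the larger groups, so the two can diverge when the classes are imbalanced.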

RESULTS

The training cohort comprised 1,289 patients and the test cohort 552 patients. A total of 1,841 textual reports from 2011 to 2019 were included. For the prediction of recurrence risk, long short-term memory (LSTM), extreme gradient boosting (XGBoost), and support vector machine (SVM) models achieved accuracies of 0.89, 0.86, and 0.78, respectively. The AUC values of the micro-average ROC curves for LSTM, XGBoost, and SVM were 0.98 ± 0.01, 0.97 ± 0.03, and 0.92 ± 0.06, respectively. In particular, the LSTM model outperformed the other models: its accuracy, F1 score, macro-average F1 score (0.87), and weighted-average F1 score (0.89) were higher. All values were statistically significant. Patients in the high-risk group predicted by our model were more resistant to DNA-damaging and microtubule-targeting drugs than those in the intermediate-risk group. No statistically significant difference was found between the predicted low-risk patients and the intermediate- or high-risk patients, owing to the small sample size (our model predicted 188 low-risk patients, of whom only two received chemotherapy alone after surgery). The prognosis of patients predicted by our model was consistent with the actual follow-up records.
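As a rough illustration of what a micro-average AUC measures in a three-class setting, the sketch below pools the one-vs-rest (label, score) pairs of all classes into a single binary problem and computes the AUC as the probability that a positive pair outscores a negative pair. This is an assumption about the general technique, not the authors' code; the function names and toy probabilities are hypothetical and are not study data.

```python
def binary_auc(labels, scores):
    """AUC as P(score of random positive > score of random negative); ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def micro_average_auc(y_true, y_prob, classes):
    """Flatten one-vs-rest indicators and scores across classes, then take one AUC."""
    flat_labels, flat_scores = [], []
    for true_label, probs in zip(y_true, y_prob):
        for cls, prob in zip(classes, probs):
            flat_labels.append(1 if true_label == cls else 0)
            flat_scores.append(prob)
    return binary_auc(flat_labels, flat_scores)


# Hypothetical predicted probabilities for four patients (NOT study data).
classes = ["low", "intermediate", "high"]
y_true = ["low", "intermediate", "high", "high"]
y_prob = [
    [0.7, 0.2, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],  # a high-risk patient scored as intermediate
]

print(f"micro-average AUC = {micro_average_auc(y_true, y_prob, classes):.3f}")
```

Because every patient contributes one pair per class, the micro average reflects overall ranking quality across all classes at once rather than the mean of three separate per-class curves.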

CONCLUSIONS

The constructed model accurately predicted the recurrence risk of breast cancer patients from EHR data and reliably evaluated the chemoresistance and prognosis of patients. Therefore, our model can help clinicians formulate individualized management plans for breast cancer patients.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b3f/10029918/9f212e4919d2/fonc-13-1117420-g001.jpg
