Machine learning and natural language processing (NLP) approach to predict early progression to first-line treatment in real-world hormone receptor-positive (HR+)/HER2-negative advanced breast cancer patients.

Author information

Medical Oncology Intercenter Unit, Regional and Virgen de la Victoria University Hospitals, IBIMA, Málaga, Spain.

University of Málaga, Department of Languages and Computer Science, E.T.S.I. Computing, Málaga, Spain.

Publication information

Eur J Cancer. 2021 Feb;144:224-231. doi: 10.1016/j.ejca.2020.11.030. Epub 2020 Dec 26.

Abstract

BACKGROUND

CDK4/6 inhibitors plus endocrine therapies are the current standard of care in the first-line treatment of HR+/HER2-negative metastatic breast cancer, but there are no well-established clinical or molecular predictive factors for patient response. In the era of personalised oncology, new approaches for developing predictive models of response are needed.

MATERIALS AND METHODS

Data derived from the electronic health records (EHRs) of real-world patients with HR+/HER2-negative advanced breast cancer were used to develop predictive models for early and late progression on first-line treatment. Two machine learning approaches were used: a classic approach using a dataset of features manually extracted from reviewed patient EHRs, and a second approach applying natural language processing (NLP) to the free-text clinical notes recorded during medical visits.
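The abstract does not say how the free-text notes were turned into model inputs. As a minimal sketch of the general idea only, the snippet below assumes a plain bag-of-words representation, one common way to convert clinical notes into numeric features. The function names and the two example notes are hypothetical and are not taken from the study.

```python
import re
from collections import Counter


def tokenize(note):
    """Lowercase a note and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", note.lower())


def build_vocabulary(notes, min_count=1):
    """Collect tokens that appear in at least `min_count` notes, sorted."""
    doc_counts = Counter(tok for note in notes for tok in set(tokenize(note)))
    return sorted(tok for tok, c in doc_counts.items() if c >= min_count)


def vectorize(note, vocabulary):
    """Count how often each vocabulary token occurs in one note."""
    counts = Counter(tokenize(note))
    return [counts[tok] for tok in vocabulary]


# Hypothetical notes, invented for illustration only.
notes = ["Bone pain, bone metastases", "No pain reported"]
vocabulary = build_vocabulary(notes)
features = [vectorize(note, vocabulary) for note in notes]
```

Each row of `features` could then be passed to any standard classifier in place of the manually extracted variables; which classifier and text representation the authors used is not stated in the abstract.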

RESULTS

Of the 610 patients included, 473 (77.5%) progressed on first-line treatment; in 126 patients (20.6% of the whole cohort) progression occurred within the first 6 months. In 152 patients (24.9%), no disease progression was observed before 28 months from the onset of first-line treatment. The best predictive model for early progression using the manually extracted dataset achieved an area under the curve (AUC) of 0.734 (95% CI 0.687-0.782). Using the NLP free-text processing approach, the best model obtained an AUC of 0.758 (95% CI 0.714-0.800). The best model to predict long responders using manually extracted data obtained an AUC of 0.669 (95% CI 0.608-0.730). With NLP free-text processing, the best model attained an AUC of 0.752 (95% CI 0.705-0.799).
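To make the reported metric concrete, the sketch below computes an AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative one, and attaches a percentile bootstrap confidence interval. The abstract does not state how its 95% CIs were obtained, so the bootstrap is an assumption, and the labels and scores are toy values rather than study data.

```python
import random


def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not the paper's)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap an AUC")
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_resamples)]
    upper = aucs[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Toy data: 1 = early progression, 0 = no early progression.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_ci = bootstrap_ci(toy_labels, toy_scores)
```

On so few cases the interval is very wide; with several hundred patients, as in the cohort described here, intervals of a few hundredths around the point estimate are plausible.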

CONCLUSIONS

Using machine learning methods, we developed predictive models for early and late progression on first-line treatment of HR+/HER2-negative metastatic breast cancer. We also found that NLP-based machine learning models performed slightly better than predictive models based on manually obtained data.
