Information extraction for prognostic stage prediction from breast cancer medical records using NLP and ML.

Affiliations

School of Computer Engineering and Technology, MIT World Peace University, Pune, India, 411029.

Department of Computer Engineering and Information Technology, College of Engineering, Pune, 411005, India.

Publication information

Med Biol Eng Comput. 2021 Sep;59(9):1751-1772. doi: 10.1007/s11517-021-02399-7. Epub 2021 Jul 23.

DOI:10.1007/s11517-021-02399-7

PMID:34297300

Abstract

For cancer prediction, the prognostic stage is the main factor that helps medical experts decide on the optimal treatment for a patient. Specialists read prognostic stage information from medical reports, which are often unstructured, so review takes considerable time. The main objective of this study is to propose a generic clinical decision-unifying staging method that extracts the most reliable prognostic stage information for breast cancer from the medical records of various health institutions. Additional prognostic elements must be extracted from medical reports to identify the cancer stage, obtain a more exact measure of the cancer, and improve care quality. The study collected 465 pathological and clinical reports of breast cancer patients from reputed medical institutions in India. The unstructured records differed from institute to institute. Anatomic and biologic factors were extracted from the medical records using natural language processing, machine learning, and rule-based methods for prognostic stage detection. The study extracted anatomic stage, grade, estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) from medical reports with high accuracy and predicted the prognostic stage for both regions studied, rural and urban. The average accuracy of prognostic stage prediction was 92% in rural areas and 82% in urban areas. It was essential to combine biological and anatomical elements under a single prognostic staging method. That a generic clinical decision-unifying staging method detects the prognostic stage with high accuracy across institutions in different regions suggests that the proposed research improves the prognosis of breast cancer.
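The abstract names rule-based extraction of biologic factors (ER, PR, HER2, grade) from free-text reports but does not describe the rules themselves. The following is only a minimal sketch of what such a rule could look like, assuming English-language reports that state receptor status with the words "positive" or "negative". The patterns, the function name `extract_factors`, and the sample report text are all hypothetical illustrations, not the authors' method or data.

```python
import re

# Hypothetical patterns for illustration only; real reports vary by institution
# and would need many more variants (e.g. "+", "3+", "equivocal", Allred scores).
STATUS = r"(positive|negative)"
# Optional "status", then an optional ":", "=" or "is" between name and value.
SEP = r"\s*(?:status)?\s*(?:[:=]|is)?\s*"

PATTERNS = {
    "ER": re.compile(r"\b(?:ER|estrogen receptor)\b" + SEP + STATUS, re.IGNORECASE),
    "PR": re.compile(r"\b(?:PR|progesterone receptor)\b" + SEP + STATUS, re.IGNORECASE),
    "HER2": re.compile(r"\bHER-?2(?:/neu)?" + SEP + STATUS, re.IGNORECASE),
    "grade": re.compile(r"\bgrade\s*[:=]?\s*(III|II|I|[1-3])\b", re.IGNORECASE),
}

ROMAN = {"I": 1, "II": 2, "III": 3}


def extract_factors(report: str) -> dict:
    """Return the first match for each factor, or None when a factor is absent."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(report)
        if match is None:
            found[name] = None
        elif name == "grade":
            raw = match.group(1).upper()
            # Normalise Roman or Arabic grade notation to an integer 1-3.
            found[name] = ROMAN[raw] if raw in ROMAN else int(raw)
        else:
            found[name] = match.group(1).lower()
    return found


if __name__ == "__main__":
    # Invented sample text, not taken from the study's reports.
    sample = (
        "Invasive ductal carcinoma, Grade II. ER: Positive. "
        "PR negative. HER2/neu status: Negative."
    )
    print(extract_factors(sample))
```

A lookup of this kind only covers the extraction step; mapping the extracted factors together with the anatomic stage to a prognostic stage would require the staging tables used by the study, which the abstract does not reproduce.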

Similar articles

Information extraction for prognostic stage prediction from breast cancer medical records using NLP and ML.

Med Biol Eng Comput. 2021 Sep;59(9):1751-1772. doi: 10.1007/s11517-021-02399-7. Epub 2021 Jul 23.

Prognostic elements extraction from documents to detect prognostic stage.

Comput Methods Biomech Biomed Engin. 2022 Mar;25(4):371-386. doi: 10.1080/10255842.2021.1955359. Epub 2021 Jul 28.

Evaluation of the prognostic stage in the 8th edition of the American Joint Committee on Cancer in locally advanced breast cancer: An analysis based on SEER 18 database.

Breast. 2018 Feb;37:56-63. doi: 10.1016/j.breast.2017.10.011. Epub 2017 Oct 31.

The assessment of 8th edition AJCC prognostic staging system and a simplified staging system for breast cancer: The analytic results from the SEER database.

Breast J. 2019 Sep;25(5):838-847. doi: 10.1111/tbj.13347. Epub 2019 Jun 13.

Personalizing breast cancer staging by the inclusion of ER, PR, and HER2.

JAMA Surg. 2014 Feb;149(2):125-9. doi: 10.1001/jamasurg.2013.3181.

Machine learning and natural language processing (NLP) approach to predict early progression to first-line treatment in real-world hormone receptor-positive (HR+)/HER2-negative advanced breast cancer patients.

Eur J Cancer. 2021 Feb;144:224-231. doi: 10.1016/j.ejca.2020.11.030. Epub 2020 Dec 26.

Prognostic values of negative estrogen or progesterone receptor expression in patients with luminal B HER2-negative breast cancer.

World J Surg Oncol. 2016 Sep 13;14(1):244. doi: 10.1186/s12957-016-0999-x.

Is the TNM staging system for breast cancer still relevant in the era of biomarkers and emerging personalized medicine for breast cancer - an institution's 10-year experience.

Breast J. 2015 Mar-Apr;21(2):147-54. doi: 10.1111/tbj.12367. Epub 2015 Jan 20.

New and Important Changes in the TNM Staging System for Breast Cancer.

Am Soc Clin Oncol Educ Book. 2018 May 23;38:457-467. doi: 10.1200/EDBK_201313.

The effect of the American Joint Committee on Cancer eighth edition on breast cancer staging and prognostication.

Eur J Surg Oncol. 2019 Oct;45(10):1817-1820. doi: 10.1016/j.ejso.2019.03.027. Epub 2019 Mar 23.

Cited by

Collecting routine and timely cancer stage at diagnosis by implementing a cancer staging tiered framework: the Western Australian Cancer Registry experience.

BMC Health Serv Res. 2024 Jun 28;24(1):770. doi: 10.1186/s12913-024-11224-4.

Natural Language Processing for Breast Imaging: A Systematic Review.

Diagnostics (Basel). 2023 Apr 14;13(8):1420. doi: 10.3390/diagnostics13081420.

The innovative model based on artificial intelligence algorithms to predict recurrence risk of patients with postoperative breast cancer.

Front Oncol. 2023 Mar 7;13:1117420. doi: 10.3389/fonc.2023.1117420. eCollection 2023.

Natural Language Processing Applications for Computer-Aided Diagnosis in Oncology.

Diagnostics (Basel). 2023 Jan 12;13(2):286. doi: 10.3390/diagnostics13020286.

References

A machine learning-based prognostic predictor for stage III colon cancer.

Sci Rep. 2020 Jun 25;10(1):10333. doi: 10.1038/s41598-020-67178-0.

Prediction of new associations between ncRNAs and diseases exploiting multi-type hierarchical clustering.

BMC Bioinformatics. 2020 Feb 24;21(1):70. doi: 10.1186/s12859-020-3392-2.

Predicted Prognosis of Patients with Pancreatic Cancer by Machine Learning.

Clin Cancer Res. 2020 May 15;26(10):2411-2421. doi: 10.1158/1078-0432.CCR-19-1247. Epub 2020 Jan 28.

Exploiting transfer learning for the reconstruction of the human gene regulatory network.

Bioinformatics. 2020 Mar 1;36(5):1553-1561. doi: 10.1093/bioinformatics/btz781.

Validation of the AJCC 8th prognostic system for breast cancer in an Asian healthcare setting.

Breast. 2018 Aug;40:38-44. doi: 10.1016/j.breast.2018.04.013. Epub 2018 Apr 17.

Breast. 2018 Feb;37:56-63. doi: 10.1016/j.breast.2017.10.011. Epub 2017 Oct 31.

Cancer Registration in India - Current Scenario and Future Perspectives.

Asian Pac J Cancer Prev. 2016;17(8):3687-96.

A natural language processing program effectively extracts key pathologic findings from radical prostatectomy reports.

J Endourol. 2014 Dec;28(12):1474-8. doi: 10.1089/end.2014.0221.

Cross-hospital portability of information extraction of cancer staging information.

Artif Intell Med. 2014 Sep;62(1):11-21. doi: 10.1016/j.artmed.2014.06.002. Epub 2014 Jun 21.

The feasibility of using natural language processing to extract clinical information from breast pathology reports.

J Pathol Inform. 2012;3:23. doi: 10.4103/2153-3539.97788. Epub 2012 Jun 30.
