Natural Language Processing of Large-Scale Structured Radiology Reports to Identify Oncologic Patients With or Without Splenomegaly Over a 10-Year Period.

Author affiliations

Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY.

School of Computing, Queen's University, Kingston, Ontario, Canada.

Publication information

JCO Clin Cancer Inform. 2022 Jan;6:e2100104. doi: 10.1200/CCI.21.00104.

Abstract

PURPOSE

To assess the accuracy of a natural language processing (NLP) model in extracting splenomegaly described in patients with cancer in structured computed tomography radiology reports.

METHODS

In this retrospective study, 387,359 consecutive structured radiology reports for computed tomography scans of the chest, abdomen, and pelvis, obtained between July 2009 and April 2019 from 91,665 patients spanning 30 types of cancer, were included. A randomized sample of 2,022 reports from patients with colorectal cancer, hepatobiliary cancer (HB), leukemia, Hodgkin lymphoma (HL), or non-Hodgkin lymphoma (non-HL) was manually annotated as positive or negative for splenomegaly. NLP model training and testing were performed on 1,617 and 405 reports, respectively, and a new validation set of 400 reports from all cancer subtypes was used to test NLP model accuracy, precision, and recall. Overall survival was compared between patients with and without splenomegaly using Kaplan-Meier curves.
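The Kaplan-Meier comparison mentioned above rests on the product-limit estimator, which can be sketched in a few lines. This is a minimal illustration only: the function name `kaplan_meier`, its arguments, and the toy follow-up data are assumptions made for this example, and the abstract does not say which software the authors used.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit survival estimate (illustrative sketch, not the authors' code).

    durations: follow-up time for each patient
    events:    1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy data (hypothetical, not from the study): five patients, two censored.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Running the estimator once per group (reports labeled with and without splenomegaly) would give the two curves being compared.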

RESULTS

The final cohort included 387,359 reports from 91,665 patients (mean age 60.8 years; 51.2% women). In the testing set, the model achieved accuracy of 92.1%, precision of 92.2%, and recall of 92.1% for splenomegaly. In the validation set, accuracy, precision, and recall were 93.8%, 92.9%, and 86.7%, respectively. In the entire cohort, splenomegaly was most frequent in patients with leukemia (32.5%), followed by HB (17.4%), non-HL (9.1%), colorectal cancer (8.5%), and HL (5.6%). A splenomegaly label was associated with an increased risk of mortality in the entire cohort (hazard ratio 2.10; 95% CI, 1.98 to 2.22; P < .001).
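For readers less familiar with the three reported metrics, the sketch below shows how accuracy, precision, and recall are computed for a binary label such as splenomegaly. The function name `binary_metrics` and the toy labels are hypothetical. The abstract does not state whether the reported precision and recall were computed for the positive class only or averaged across classes, so this shows the positive-class definitions.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for the positive class (label 1).

    y_true: manually annotated labels (1 = splenomegaly, 0 = none)
    y_pred: labels predicted by the model
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy labels (hypothetical, not from the study): 8 reports, 3 truly positive.
toy_metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0])
```

Precision answers "of the reports the model flagged, how many were truly positive", while recall answers "of the truly positive reports, how many did the model flag". This is why the validation set can show high precision (92.9%) alongside lower recall (86.7%).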

CONCLUSION

Automated splenomegaly labeling by NLP of radiology reports demonstrates good accuracy, precision, and recall. Splenomegaly is most frequently reported in patients with leukemia, followed by patients with HB.
