Developing prompts from large language model for extracting clinical information from pathology and ultrasound reports in breast cancer.

Authors

Choi Hyeon Seok, Song Jun Yeong, Shin Kyung Hwan, Chang Ji Hyun, Jang Bum-Sup

Affiliations

Department of Radiation Oncology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea.

Institute of Radiation Medicine, Seoul National University Medical Research Center, Seoul, Korea.

Publication information

Radiat Oncol J. 2023 Sep;41(3):209-216. doi: 10.3857/roj.2023.00633. Epub 2023 Sep 21.

DOI:10.3857/roj.2023.00633

PMID:37793630

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10556835/

Abstract

PURPOSE

We aimed to evaluate the time and cost of developing prompts with a large language model (LLM) to extract clinical factors from the reports of breast cancer patients, and to assess the accuracy of the extracted information.

MATERIALS AND METHODS

We collected surgical pathology and ultrasound reports from breast cancer patients who underwent radiotherapy from 2020 to 2022. We extracted the information using the Generative Pre-trained Transformer (GPT) for Sheets and Docs extension plugin and termed this the "LLM" method. The time and cost of developing the prompts with the LLM method were assessed and compared with those of collecting the information with the "full manual" and "LLM-assisted manual" methods. To assess accuracy, 340 patients were randomly selected, and the information extracted by the LLM method was compared with that collected by the "full manual" method.
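The accuracy comparison described above can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the abstract does not state how a match was scored, so exact equality after whitespace and case normalization is an assumption, and the field names (`lvi`, `grade`) are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def normalize(value: str) -> str:
    """Lower-case and collapse whitespace so trivial formatting differences do not count as errors."""
    return " ".join(value.strip().lower().split())


def field_accuracy(
    llm_rows: List[Record], manual_rows: List[Record], fields: List[str]
) -> Tuple[Dict[str, float], float]:
    """Return (per-field accuracy, overall accuracy) of LLM output against the manual reference.

    Assumption: rows are aligned by patient, and a field is correct only when the
    normalized strings are identical.
    """
    if len(llm_rows) != len(manual_rows):
        raise ValueError("LLM and manual tables must have the same number of patients")
    if not manual_rows or not fields:
        raise ValueError("need at least one patient and one field")

    correct = {name: 0 for name in fields}
    for llm, manual in zip(llm_rows, manual_rows):
        for name in fields:
            if normalize(llm.get(name, "")) == normalize(manual.get(name, "")):
                correct[name] += 1

    n = len(manual_rows)
    per_field = {name: correct[name] / n for name in fields}
    overall = sum(correct.values()) / (n * len(fields))
    return per_field, overall
```

With a random sample such as the 340 patients mentioned above, `manual_rows` would hold the "full manual" values and `llm_rows` the values returned by the prompts, one dictionary per patient.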

RESULTS

Data from 2,931 patients were collected. We developed 12 prompts for the Extract function and 12 for the Format function to extract and standardize the information. The overall accuracy was 87.7%. For lymphovascular invasion, it was 98.2%. Developing the prompts took 3.5 hours, and processing them took 15 minutes. Using the ChatGPT application programming interface (API) cost US $65.8; when the estimated wage was factored in, the total cost was US $95.4. In an estimated comparison, the "LLM-assisted manual" and "LLM" methods were more time- and cost-efficient than the "full manual" method.
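The reported figures imply a few derived quantities, worked through in the short sketch below. Only the constants come from the abstract. The derived values are plain arithmetic, and reading the gap between the total and the API cost as the estimated wage is an interpretation of the wording above, not a figure the authors report.

```python
# Figures reported in the abstract.
API_COST_USD = 65.8        # ChatGPT API usage
TOTAL_COST_USD = 95.4      # API usage plus estimated wage
DEVELOP_HOURS = 3.5        # prompt development
PROCESS_HOURS = 15 / 60    # prompt processing (15 minutes)
N_PATIENTS = 2931


def wage_component(total_usd: float, api_usd: float) -> float:
    """Part of the total cost not explained by API usage (interpreted here as the estimated wage)."""
    return round(total_usd - api_usd, 2)


def total_hours(develop_hours: float, process_hours: float) -> float:
    """Combined development and processing time in hours."""
    return develop_hours + process_hours


def cost_per_patient(total_usd: float, n_patients: int) -> float:
    """Average total cost per patient record processed."""
    return total_usd / n_patients


if __name__ == "__main__":
    print("wage component (USD):", wage_component(TOTAL_COST_USD, API_COST_USD))
    print("total time (h):", total_hours(DEVELOP_HOURS, PROCESS_HOURS))
    print("cost per patient (USD):", round(cost_per_patient(TOTAL_COST_USD, N_PATIENTS), 4))
```

On these numbers the "LLM" method comes to roughly three US cents per patient, which gives a sense of scale for the comparison with the manual methods.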

CONCLUSION

Developing and using prompts for an LLM to derive clinical factors was an efficient way to extract crucial information from a large volume of medical records. This study demonstrated the potential of LLM-based natural language processing in breast cancer patients. The prompts from the current study can be reused in other research to collect clinical information.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4d8e/10556835/03ddcf9e120b/roj-2023-00633f1.jpg

Similar articles

ChatGPT and large language model (LLM) chatbots: The current state of acceptability and a proposal for guidelines on utilization in academic medicine.

J Pediatr Urol. 2023 Oct;19(5):598-604. doi: 10.1016/j.jpurol.2023.05.018. Epub 2023 Jun 2.

Generative large language models are all-purpose text analytics engines: text-to-text learning is all your need.

J Am Med Inform Assoc. 2024 Sep 1;31(9):1892-1903. doi: 10.1093/jamia/ocae078.

Diagnosing Glaucoma Based on the Ocular Hypertension Treatment Study Dataset Using Chat Generative Pre-Trained Transformer as a Large Language Model.

Ophthalmol Sci. 2024 Aug 22;5(1):100599. doi: 10.1016/j.xops.2024.100599. eCollection 2025 Jan-Feb.

From jargon to clarity: Improving the readability of foot and ankle radiology reports with an artificial intelligence large language model.

Foot Ankle Surg. 2024 Jun;30(4):331-337. doi: 10.1016/j.fas.2024.01.008. Epub 2024 Feb 5.

LLM-AIx: An open source pipeline for Information Extraction from unstructured medical text based on privacy preserving Large Language Models.

medRxiv. 2024 Sep 3:2024.09.02.24312917. doi: 10.1101/2024.09.02.24312917.

Validation of large language models for detecting pathologic complete response in breast cancer using population-based pathology reports.

BMC Med Inform Decis Mak. 2024 Oct 3;24(1):283. doi: 10.1186/s12911-024-02677-y.

Use of Generative AI to Identify Helmet Status Among Patients With Micromobility-Related Injuries From Unstructured Clinical Notes.

JAMA Netw Open. 2024 Aug 1;7(8):e2425981. doi: 10.1001/jamanetworkopen.2024.25981.

Extracting structured information from unstructured histopathology reports using generative pre-trained transformer 4 (GPT-4).

J Pathol. 2024 Mar;262(3):310-319. doi: 10.1002/path.6232. Epub 2023 Dec 14.

Large language models: Are artificial intelligence-based chatbots a reliable source of patient information for spinal surgery?

Eur Spine J. 2024 Nov;33(11):4135-4143. doi: 10.1007/s00586-023-07975-z. Epub 2023 Oct 11.

Cited by

Performance of Natural Language Processing for Information Extraction From Electronic Health Records Within Cancer: Systematic Review.

JMIR Med Inform. 2025 Sep 12;13:e68707. doi: 10.2196/68707.

Development and Validation of a Large Language Model-Based System for Medical History-Taking Training: Prospective Multicase Study on Evaluation Stability, Human-AI Consistency, and Transparency.

JMIR Med Educ. 2025 Aug 29;11:e73419. doi: 10.2196/73419.

Incorporating large language models as clinical decision support in oncology: the Woollie model.

NPJ Digit Med. 2025 Aug 18;8(1):529. doi: 10.1038/s41746-025-01941-3.

Development and evaluation of large-language models (LLMs) for oncology: A scoping review.

PLOS Digit Health. 2025 Aug 7;4(8):e0000980. doi: 10.1371/journal.pdig.0000980. eCollection 2025 Aug.

Challenges and opportunities to integrate artificial intelligence in radiation oncology: a narrative review.

Ewha Med J. 2024 Oct;47(4):e49. doi: 10.12771/emj.2024.e49. Epub 2024 Oct 31.

Large language model integrations in cancer decision-making: a systematic review and meta-analysis.

NPJ Digit Med. 2025 Jul 17;8(1):450. doi: 10.1038/s41746-025-01824-7.

Data Extraction and Curation from Radiology Reports for Pancreatic Cyst Surveillance Using Large Language Models.

J Am Coll Surg. 2025 Jul 10. doi: 10.1097/XCS.0000000000001478.

Open-Source Hybrid Large Language Model Integrated System for Extraction of Breast Cancer Treatment Pathway From Free-Text Clinical Notes.

JCO Clin Cancer Inform. 2025 Jun;9:e2500002. doi: 10.1200/CCI-25-00002. Epub 2025 Jun 27.

Celebrating Ulrik Ringborg: Multi-Omics-Based Patient Stratification for Precision Cancer Treatment.

Biomolecules. 2025 May 10;15(5):693. doi: 10.3390/biom15050693.

The influence of prompt engineering on large language models for protein-protein interaction identification in biomedical literature.

Sci Rep. 2025 May 3;15(1):15493. doi: 10.1038/s41598-025-99290-4.
