RADEX: a rule-based clinical and radiology data extraction tool demonstrated on thyroid ultrasound reports.

Author information

Howell Lewis, Zarei Amir, Wah Tze Min, Chandler James H, Karthik Shishir, Court Zara, Ng Helen, McLaughlan James R

Affiliations

School of Computing, University of Leeds, Leeds, LS2 9JT, UK.

School of Electronic and Electrical Engineering, University of Leeds, Leeds, LS2 9JT, UK.

Publication information

Eur Radiol. 2025 Feb 13. doi: 10.1007/s00330-025-11416-4.

DOI:10.1007/s00330-025-11416-4

PMID:39945809

Abstract

OBJECTIVES

Radiology reports contain valuable information for research and audits, but relevant details are often buried within free-text fields. This makes them challenging and time-consuming to extract for secondary analyses, including training artificial intelligence (AI) models.

MATERIALS AND METHODS

This study presents a rule-based RAdiology Data EXtraction tool (RADEX) to enable biomedical researchers and healthcare professionals to automate information extraction from clinical documents. RADEX simplifies the translation of domain expertise into regular-expression models, enabling context-dependent searching without specialist expertise in Natural Language Processing. Its utility was demonstrated in the multi-label classification of fourteen clinical features in a large retrospective dataset (n = 16,246) of thyroid ultrasound reports from five hospitals in the United Kingdom (UK). A tuning subset (n = 200) was used to iteratively develop the search strategy, and a holdout test subset (n = 202) was used to evaluate the performance against reference-standard labels.
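The idea of turning domain knowledge into regular-expression rules with context-dependent matching can be illustrated with a minimal sketch. It is not RADEX's implementation or API: the feature names, patterns, negation cues, and the `classify` helper are all hypothetical, chosen only to show how a match can be accepted or rejected based on the text around it.

```python
import re

# Hypothetical rules: feature name -> regex pattern. These are illustrative
# only and are not taken from RADEX or from the study's search strategy.
RULES = {
    "nodule": re.compile(r"\bnodules?\b", re.IGNORECASE),
    "cyst": re.compile(r"\bcyst(s|ic)?\b", re.IGNORECASE),
    "calcification": re.compile(r"\b(micro)?calcifications?\b", re.IGNORECASE),
}

# Assumed negation cues: a match is discarded when one of these words
# appears earlier in the same sentence.
NEGATION = re.compile(r"\b(no|not|without|absent)\b", re.IGNORECASE)


def split_sentences(text):
    """Crude sentence splitter on '.', ';' and newlines (would also split decimals)."""
    return [s.strip() for s in re.split(r"[.;\n]+", text) if s.strip()]


def classify(report):
    """Return a feature -> bool mapping (multi-label output) for one report."""
    labels = {name: False for name in RULES}
    for sentence in split_sentences(report):
        for name, pattern in RULES.items():
            for match in pattern.finditer(sentence):
                # Context check: only the text preceding the match is inspected.
                preceding = sentence[: match.start()]
                if not NEGATION.search(preceding):
                    labels[name] = True
    return labels


report = (
    "There is a 12 mm solid nodule in the right lobe. "
    "No microcalcification. No cysts seen."
)
print(classify(report))
```

Because each label is decided by an explicit pattern and an explicit context rule, every positive or negative result can be traced back to the sentence that produced it, which is the kind of explainability a rule-based approach offers over a learned model.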

RESULTS

The dataset cardinality was 3.06, and the label density was 0.34. Cohen's Kappa was 0.94 for rater 1 and 0.95 for rater 2. For RADEX, micro-average sensitivity, specificity, and F1-score were 0.97, 0.96, and 0.94, respectively. The processing time was 12.3 milliseconds per report, enabling fast and reliable information extraction.
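For readers unfamiliar with these multi-label quantities, the sketch below computes them on a toy example. The abstract does not give the study's exact formulas, so the definitions used here are the commonly used ones and are an assumption: label cardinality as the mean number of positive labels per report, label density as cardinality divided by the number of labels, and micro-averaging as pooling every (report, label) decision into one confusion matrix. The function names and the toy matrices are hypothetical.

```python
def label_stats(y):
    """Return (cardinality, density) for a binary label matrix (list of rows)."""
    n_reports = len(y)
    n_labels = len(y[0])
    # Cardinality: average number of positive labels per report.
    cardinality = sum(sum(row) for row in y) / n_reports
    # Density: cardinality normalised by the number of labels (assumed definition).
    return cardinality, cardinality / n_labels


def micro_metrics(y_true, y_pred):
    """Micro-averaged sensitivity, specificity and F1 over all labels."""
    tp = fp = tn = fn = 0
    # Pool every (report, label) decision into a single confusion matrix.
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return sensitivity, specificity, f1


# Toy data: 4 reports, 3 labels (not the study's data).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 0]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 0]]

print(label_stats(y_true))
print(micro_metrics(y_true, y_pred))
```

Micro-averaging weights every label decision equally, so frequent labels dominate the score; a macro-average, which averages per-label scores, would instead weight each label equally.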

CONCLUSION

RADEX is a versatile tool for bespoke research and audit applications, where access to labelled data or computing infrastructure is limited, or explainability and reproducibility are priorities. This offers a time-saving and freely available option to accelerate structured data collection, enabling new insights and improved patient care.

KEY POINTS

Question: Radiology reports contain vital information that is buried in unstructured free-text fields. Can we extract this information effectively for research and audit applications?

Findings: A rule-based RAdiology Data Extraction tool (RADEX) is described and used to classify fourteen key findings from thyroid ultrasound reports with sensitivity and specificity > 0.95.

Clinical relevance: RADEX offers clinicians and researchers a time-saving tool to accelerate structured data collection. This practical approach prioritises transparency, repeatability, and usability, enabling new insights into improved patient care.
