Reichenpfader Daniel, Denecke Kerstin
Bern University of Applied Sciences, Institute for Patient-centered Digital Health, Biel/Bienne, Switzerland.
Stud Health Technol Inform. 2024 Aug 22;316:1669-1673. doi: 10.3233/SHTI240744.
Rapid technical progress in clinical Natural Language Processing and information extraction (IE) has made studies in this domain difficult to compare and replicate.
This paper proposes a reporting guideline to standardize the description of methodologies and outcomes for studies involving IE from clinical texts.
The guideline was developed from experience gained during data extraction for a previously conducted scoping review of IE from free-text radiology reports, which included 34 studies.
The guideline comprises five top-level categories: information model, architecture, data, annotation, and outcomes. Across these categories, we define a total of 28 aspects to be reported in IE studies.
The proposed guideline is expected to set a standard for reporting in studies describing IE from clinical text and to promote uniformity across the research field. Expected future technological advancements may make regular updates of the guideline necessary. In future research, we plan to develop a taxonomy that clearly defines the corresponding value sets and to integrate this guideline with the taxonomy by following a consensus-based methodology.