Department of Radiology, University Hospital Basel, Petersgraben 4, 4031, Basel, Switzerland.
Eur Radiol. 2023 Nov;33(11):7496-7506. doi: 10.1007/s00330-023-10050-2. Epub 2023 Aug 5.
To investigate how a transition from free text to structured reporting affects reporting language with regard to standardization and distinguishability.
A total of 747,393 radiology reports dictated between January 2011 and June 2020 were retrospectively analyzed. The body and cardiothoracic imaging divisions introduced a reporting concept using standardized language and structured reporting templates in January 2016. Reports were segmented by a natural language processing algorithm and converted into 20-dimensional document vectors. For analysis, dimensionality was reduced to a 2D visualization with t-distributed stochastic neighbor embedding and matched with metadata. Linguistic standardization was assessed by comparing the vector spread of distinct report types (e.g., run-off MR angiography) between reporting standards. Changes in report type distinguishability (e.g., CT abdomen/pelvis vs. MR abdomen) were measured by comparing the distances between their centroids.
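The two measures described above can be illustrated with a minimal sketch. The abstract does not give the exact formula for "vector spread", so the code assumes it is the mean Euclidean distance of a report type's points from their centroid. The function names (`centroid`, `spread`, `centroid_distance`) and the toy coordinates are hypothetical and are not taken from the study.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equally sized coordinate tuples."""
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


def spread(points):
    """Assumed spread measure: mean Euclidean distance to the centroid."""
    c = centroid(points)
    return sum(math.dist(p, c) for p in points) / len(points)


def centroid_distance(group_a, group_b):
    """Euclidean distance between the centroids of two report types."""
    return math.dist(centroid(group_a), centroid(group_b))


# Toy 2D embeddings (illustrative only, not study data).
loose_cluster = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
tight_cluster = [(9.0, 9.0), (11.0, 9.0), (9.0, 11.0), (11.0, 11.0)]

print(spread(loose_cluster))   # larger spread: less uniform wording
print(spread(tight_cluster))   # smaller spread: more uniform wording
print(centroid_distance(loose_cluster, tight_cluster))
```

Under this assumed definition, a lower spread means the reports of one type lie closer together in the embedding, and a larger centroid distance means two report types are easier to tell apart.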
Structured reports showed lower document vector spread (thus higher linguistic similarity) compared with free-text reports overall (21.9 [free-text] vs. 15.9 [structured]; -27.4%; p < 0.001) and for most report types, e.g., run-off MR angiography (15.2 vs. 1.8; -88.2%; p < 0.001) or double-rule-out CT (26.8 vs. 10.0; -62.7%; p < 0.001). No changes were observed for reports that continued to be written in free text, e.g., CT head reports (33.2 vs. 33.1; -0.3%; p = 1). Distances between the report types' centroids increased with structured reporting (thus better linguistic distinguishability) overall (27.3 vs. 54.4; +99.3 ± 98.4%) and for specific report types, e.g., CT abdomen/pelvis vs. MR abdomen (13.7 vs. 37.2; +171.5%).
Structured reporting and the use of factual language yield more homogeneous and standardized radiology reports on a linguistic level, tailored to specific reporting scenarios and imaging studies.
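The percentage changes quoted above are consistent with a plain relative change between the free-text and structured values. The short check below assumes that formula; the helper name `relative_change` and the dictionary layout are illustrative. The overall distance change (+99.3 ± 98.4%) is left out because it is reported with a standard deviation, so it is presumably an average over report-type pairs and cannot be recomputed from the two summary values alone.

```python
def relative_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (free-text value, structured value) pairs as quoted in the abstract.
reported = {
    "overall spread": (21.9, 15.9),
    "run-off MR angiography spread": (15.2, 1.8),
    "double-rule-out CT spread": (26.8, 10.0),
    "CT head spread": (33.2, 33.1),
    "CT abdomen/pelvis vs. MR abdomen distance": (13.7, 37.2),
}

for label, (free_text, structured) in reported.items():
    print(f"{label}: {relative_change(free_text, structured):+.1f}%")
```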
Information transmission to referring physicians, as well as automated report assessment and content extraction in big data analyses, may benefit from standardized reporting, due to consistent report organization and terminology used for pathologies and normal findings.
• Natural language processing and t-distributed stochastic neighbor embedding can transform radiology reports into numeric vectors, allowing the quantification of their linguistic standardization.
• Structured reporting substantially increases reports' linguistic standardization (mean: -27.4% in vector spread) and distinguishability (mean: +99.3 ± 98.4% increase in vector distance) compared with free-text reports.
• Higher standardization and homogeneity outline potential benefits of structured reporting for information transmission and big data analyses.