Lung Center, Cantonal Hospital St. Gallen, Rorschacherstrasse 95, St. Gallen, 9007, Switzerland.
Division of General Internal Medicine, Cantonal Hospital St. Gallen, Rorschacherstrasse 95, St. Gallen, 9007, Switzerland.
J Biomed Semantics. 2022 Jan 31;13(1):5. doi: 10.1186/s13326-022-00259-3.
Text mining can be applied to automate knowledge extraction from the unstructured data in medical reports and to generate quality indicators applicable to medical documentation. The primary objective of this study was to apply text mining methodology to the analysis of polysomnographic medical reports in order to quantify sources of variation, here diagnostic precision versus inter-rater variability, in the work-up of sleep-disordered breathing. The secondary objective was to assess the impact of text block standardization on the diagnostic precision of polysomnography reports in an independent test set.
Polysomnography reports of 243 laboratory-based overnight sleep investigations scored by 9 trained sleep specialists of the Sleep Center St. Gallen were analyzed using a text-mining methodology. Patterns in the usage of discriminating terms allowed for the characterization of type and severity of disease and of inter-rater homogeneity. The variation introduced by inter-rater (technician/physician) heterogeneity was found to be twice as high as the variation introduced by effective diagnostic information. A simple text block standardization could significantly reduce the inter-rater variability by 44%, enhance the predictive value and ultimately improve the diagnostic accuracy of polysomnography reports.
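The comparison between rater-driven and diagnosis-driven variation can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract does not specify. It assumes that each report is reduced to relative frequencies of a chosen vocabulary, and that the share of variance explained by a grouping (rater or diagnosis) is measured as between-group over total sum of squares. The function names `term_frequencies` and `between_group_share` are hypothetical.

```python
from collections import Counter


def term_frequencies(text, vocabulary):
    """Relative frequency of each vocabulary term in one report.

    Assumption: simple lowercase whitespace tokenization; a real
    analysis would likely need proper tokenization and stemming.
    """
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty reports
    return [counts[term] / total for term in vocabulary]


def between_group_share(vectors, labels):
    """Fraction of total variance explained by the group means.

    Sums squared deviations over all vocabulary terms, so the result
    is an eta-squared-style ratio between 0 and 1.
    """
    n = len(vectors)
    dims = len(vectors[0])
    grand = [sum(v[d] for v in vectors) / n for d in range(dims)]
    total_ss = sum((v[d] - grand[d]) ** 2 for v in vectors for d in range(dims))

    # Collect the report vectors belonging to each label.
    groups = {}
    for vector, label in zip(vectors, labels):
        groups.setdefault(label, []).append(vector)

    between_ss = 0.0
    for members in groups.values():
        mean = [sum(v[d] for v in members) / len(members) for d in range(dims)]
        between_ss += len(members) * sum(
            (mean[d] - grand[d]) ** 2 for d in range(dims)
        )
    return between_ss / total_ss if total_ss else 0.0
```

Calling `between_group_share` once with rater labels and once with diagnosis labels on the same report vectors gives two shares that can be compared directly; under the assumptions above, a larger share for the rater grouping would indicate that wording depends more on who wrote the report than on what was diagnosed.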
Text mining was successfully used to assess and optimize the quality, precision and homogeneity of medical reporting of diagnostic procedures, exemplified here with sleep studies. Text mining methodology could lay the groundwork for objective and systematic qualitative assessment of medical reports.