Liang Chen, Gong Yang
Louisiana Tech University, Ruston, Louisiana, USA.
School of Biomedical Informatics, University of Texas Health Science Center, Houston, Texas, USA.
Stud Health Technol Inform. 2017;245:1070-1074.
Over the past two decades, the volume of patient safety reports has grown steadily, yet the capacity to extract useful information from them remains limited. Classifying patient safety reports is the first step in any downstream analysis. In practice, manual review for classification is labor-intensive. Studies have shown that reports are often mislabeled or cannot be classified under the pre-defined categories, which presents a notable data quality problem. In this study, we investigated the multi-label nature of patient safety reports. We argue that understanding this multi-label nature is key to revealing the complex relations among the many components involved in the course and development of medical errors. Accordingly, we developed automated multi-label text classifiers to process patient safety reports. In a benchmark comparison, the experiments demonstrated the feasibility and efficiency of a combination of multi-label algorithms. Based on our experiments and results, we provide suggestions on how to implement automated classification of patient safety reports in clinical settings.