Computerized coding of injury narrative data from the National Health Interview Survey.

Author information

Wellman Helen M, Lehto Mark R, Sorock Gary S, Smith Gordon S

Affiliation

Liberty Mutual Research Institute for Safety, 71 Frankland Road, Hopkinton, MA 01748, USA.

Publication information

Accid Anal Prev. 2004 Mar;36(2):165-71. doi: 10.1016/s0001-4575(02)00146-x.

Abstract

OBJECTIVE

To investigate the accuracy of a computerized method for classifying injury narratives into external-cause-of-injury and poisoning (E-code) categories.

METHODS

This study used injury narratives from the 1997 and 1998 US National Health Interview Survey (NHIS), together with the corresponding E-codes assigned by experts. A Fuzzy Bayesian model was used to assign injury descriptions to 13 E-code categories. Sensitivity, specificity, and positive predictive value were measured by comparing the computer-generated codes with the E-code categories assigned by experts.
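The abstract does not give the model's formula or its keyword extraction rules. As a minimal sketch, assuming the usual description of Fuzzy Bayes scoring (a category's score is the maximum of P(category | keyword) over the keywords in a narrative, rather than a product over all keywords as in naive Bayes), the method and its evaluation might look like the following. The function names, the whitespace tokenization, and the toy narratives are all hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict


def train(labeled_narratives):
    """Estimate P(category | keyword) from (text, category) pairs."""
    keyword_counts = Counter()
    joint_counts = defaultdict(Counter)
    for text, category in labeled_narratives:
        # Count each keyword at most once per narrative.
        for keyword in set(text.lower().split()):
            keyword_counts[keyword] += 1
            joint_counts[keyword][category] += 1
    return {
        keyword: {cat: n / keyword_counts[keyword] for cat, n in cats.items()}
        for keyword, cats in joint_counts.items()
    }


def classify(text, model):
    """Return (best category, score), where a category's score is the
    maximum of P(category | keyword) over the narrative's keywords."""
    scores = {}
    for keyword in set(text.lower().split()):
        for category, p in model.get(keyword, {}).items():
            scores[category] = max(scores.get(category, 0.0), p)
    if not scores:
        return None, 0.0  # no known keyword in the narrative
    best = max(scores, key=scores.get)
    return best, scores[best]


def evaluate(expert_codes, computer_codes, category):
    """Sensitivity, specificity and PPV for one category (one vs. rest)."""
    tp = fp = fn = tn = 0
    for expert, computer in zip(expert_codes, computer_codes):
        if computer == category:
            if expert == category:
                tp += 1
            else:
                fp += 1
        elif expert == category:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Invented toy narratives for illustration only; not NHIS data.
TRAINING = [
    ("fell off ladder", "fall"),
    ("slipped on ice and fell", "fall"),
    ("cut finger with knife", "cut"),
    ("knife slipped cut hand", "cut"),
    ("car crash on highway", "transport"),
]

model = train(TRAINING)
print(classify("fell on ice", model))
```

Treating the expert-assigned code as the reference standard, `evaluate` would be called once per E-code category to obtain the three measures named above.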

RESULTS

The computer program correctly classified 4695 (82.7%) of the 5677 injury narratives when multiple words were included as keywords in the model. Using multiple-word predictors rather than single words alone improved both the sensitivity and the specificity of the computer-generated codes. The program is capable of identifying and filtering out cases that would benefit most from manual coding. For example, the program could be used to code the narrative if the maximum probability of a category given the keywords in the narrative was at least 0.9. If the maximum probability was lower than 0.9 (which will be the case for approximately 33% of the narratives), the case would be filtered out for manual review.
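The two ideas in this paragraph, multiple-word predictors and a probability cutoff for manual review, can be sketched as follows. This is only an illustration: the abstract does not say how multiple-word keywords were formed, so adjacent word pairs are an assumption here, and the function names and sample predictions are hypothetical. Only the 0.9 cutoff comes from the text.

```python
def keywords(text):
    """Single words plus adjacent word pairs (an assumed way of
    building multiple-word predictors)."""
    words = text.lower().split()
    pairs = [" ".join(pair) for pair in zip(words, words[1:])]
    return words + pairs


def route(predictions, threshold=0.9):
    """Split (narrative_id, category, max_probability) triples into
    computer-coded cases and cases set aside for manual review."""
    auto_coded, manual_review = [], []
    for narrative_id, category, max_probability in predictions:
        if category is not None and max_probability >= threshold:
            auto_coded.append((narrative_id, category))
        else:
            manual_review.append(narrative_id)
    return auto_coded, manual_review


# Invented predictions for illustration only.
sample = [(1, "fall", 0.95), (2, "cut", 0.60), (3, None, 0.0), (4, "fall", 0.90)]
auto_coded, manual_review = route(sample)
print(keywords("fell off ladder"))
print(auto_coded, manual_review)
```

Raising the threshold sends more narratives to manual review and should leave fewer computer-coded errors; the 0.9 value is the example given in the abstract, not a recommended setting.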

CONCLUSIONS

A computer program based on Fuzzy Bayes logic is capable of accurately categorizing cause-of-injury codes from injury narratives. The capacity to filter out certain cases for manual coding improves the utility of this process.
