Ware Henry, Mullett Charles J, Jagannathan V
Medquist, Inc, Morgantown, WV, USA.
J Am Med Inform Assoc. 2009 Jul-Aug;16(4):585-9. doi: 10.1197/jamia.M3091. Epub 2009 Apr 23.
OBJECTIVE The authors developed a natural language processing (NLP) framework for extracting clinical findings and diagnoses from dictated physician documentation.
DESIGN De-identified documentation was made available by the i2b2 Bio-informatics research group as part of its NLP challenge focusing on obesity and its co-morbidities. The authors describe their approach, which combined concept detection, context validation, and the application of a variety of rules to conclude patient diagnoses.
RESULTS The framework correctly identified diagnoses, as judged by the NLP challenge organizers against a gold standard of physician annotations. The authors' overall kappa values for agreement with the gold standard were 0.92 for explicit textual results and 0.91 for intuited results. The NLP framework compared favorably with those of the other entrants, placing third in textual results and fourth in intuited results in the i2b2 competition.
CONCLUSIONS The framework and approach used to detect clinical conditions were reasonably successful at extracting 16 diagnoses related to obesity. The system and methodology merit further development, targeting clinically useful applications.
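The three stages named under DESIGN (concept detection, context validation, rule application) can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon `CONCEPTS`, the cue list `NEGATION_CUES`, the 30-character window, and the `"Y"`/`"N"` labels are all hypothetical choices made only to show how such stages might fit together.

```python
import re

# Hypothetical lexicon: surface terms mapped to a diagnosis label.
CONCEPTS = {
    "obesity": "Obesity",
    "obese": "Obesity",
    "diabetes": "Diabetes",
    "hypertension": "Hypertension",
}

# Hypothetical negation cues, searched in a short window before a concept.
NEGATION_CUES = ("no", "denies", "without", "negative for")


def detect_concepts(sentence):
    """Concept detection: return (label, start_offset) for each lexicon hit."""
    lowered = sentence.lower()
    hits = []
    for term, label in CONCEPTS.items():
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            hits.append((label, match.start()))
    return hits


def is_negated(sentence, start, window=30):
    """Context validation: is there a negation cue shortly before the concept?"""
    prefix = sentence.lower()[max(0, start - window):start]
    return any(
        re.search(r"\b" + re.escape(cue) + r"\b", prefix)
        for cue in NEGATION_CUES
    )


def classify(document):
    """Rule step: an asserted mention yields "Y"; a label seen only negated yields "N"."""
    status = {}
    for sentence in re.split(r"(?<=[.!?])\s+", document):
        for label, start in detect_concepts(sentence):
            if is_negated(sentence, start):
                status.setdefault(label, "N")  # do not overwrite an earlier "Y"
            else:
                status[label] = "Y"
    return status
```

For example, `classify("The patient is obese with a history of hypertension. She denies diabetes.")` marks obesity and hypertension as present and diabetes as absent. A real system would need a far richer vocabulary and context model than this toy window-based check.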
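The kappa values reported under RESULTS measure agreement with the gold standard beyond what chance alone would produce. The abstract does not say which kappa variant the challenge organizers used; the sketch below assumes the standard two-rater Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name `cohen_kappa` and the sample labels are illustrative, not taken from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: sum over labels of the product of marginal frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up labels for ten documents (system output vs. gold standard).
system = ["Y", "Y", "N", "N", "Y", "N", "Y", "N", "Y", "N"]
gold = ["Y", "Y", "N", "N", "Y", "N", "Y", "N", "N", "N"]
kappa = cohen_kappa(system, gold)
```

With these made-up labels the raters agree on 9 of 10 items (p_o = 0.9) and chance agreement is p_e = 0.5, so kappa works out to 0.8.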