Ambert Kyle H, Cohen Aaron M
Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, USA.
J Am Med Inform Assoc. 2009 Jul-Aug;16(4):590-5. doi: 10.1197/jamia.M3095. Epub 2009 Apr 23.
OBJECTIVE: Free-text clinical reports are an important part of patient care management and of the clinical documentation of patient disease and treatment status. Free-text notes are commonplace in medical practice, but remain an under-used source of information for clinical and epidemiological research, as well as for personalized medicine. The authors explore the challenges of automatically extracting information from clinical reports, using their submission to the Integrating Informatics with Biology and the Bedside (i2b2) 2008 Natural Language Processing Obesity Challenge Task.
DESIGN: A text mining system for classifying patient comorbidity status based on the information contained in clinical reports. The authors' approach incorporates a variety of automated techniques, including hot-spot filtering, negated concept identification, zero-vector filtering, weighting by inverse class-frequency, and error-correcting output codes with linear support vector machines.
MEASUREMENTS: Performance was evaluated in terms of the macroaveraged F1 measure.
RESULTS: The automated system performed well against manual expert rule-based systems, finishing fifth in the Challenge's intuitive task and 13th in the textual task.
CONCLUSIONS: The system demonstrates that effective comorbidity status classification by an automated system is possible.
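The macroaveraged F1 measure named under MEASUREMENTS can be illustrated with a minimal sketch. It is not the authors' or the Challenge's evaluation code: the function name `macro_f1` and the label values in the example are assumptions made for illustration, and the official scoring may differ in details such as which classes enter the average.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Macroaveraged F1: the unweighted mean of the per-class F1 scores.

    Hypothetical helper for illustration; every class that appears in
    either list contributes equally, however rare it is.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    labels = sorted(set(gold) | set(predicted))
    if not labels:
        raise ValueError("at least one labelled document is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # class p was predicted wrongly
            fn[g] += 1          # class g was missed

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class has no counts.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative labels only (assumed here: Y = present, N = absent, U = unmentioned).
gold = ["Y", "Y", "N", "N", "U", "U"]
pred = ["Y", "N", "N", "N", "U", "Y"]
score = macro_f1(gold, pred)
```

Because each class is weighted equally, a system that does poorly on a rare comorbidity status is penalized as much as one that does poorly on a common status, which is why this measure is demanding for imbalanced clinical data.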