Zeng Qing T, Goryachev Sergey, Weiss Scott, Sordo Margarita, Murphy Shawn N, Lazarus Ross
Decision Systems Group, Brigham and Women's Hospital, Boston, MA, USA.
BMC Med Inform Decis Mak. 2006 Jul 26;6:30. doi: 10.1186/1472-6947-6-30.
The text descriptions in electronic medical records are a rich source of information. We have developed a Health Information Text Extraction (HITEx) tool and used it to extract key findings for a research study on airways disease.
The principal diagnosis, co-morbidity and smoking status extracted by HITEx from a set of 150 discharge summaries were compared to an expert-generated gold standard.
The accuracy of HITEx was 82% for principal diagnosis, 87% for co-morbidity, and 90% for smoking status extraction, when cases labeled "Insufficient Data" by the gold standard were excluded.
We consider the results promising, given the complexity of the discharge summaries and the extraction tasks.
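The accuracy figures above are computed after dropping cases the gold standard labels "Insufficient Data". The following is a minimal sketch of that scoring rule, not the authors' evaluation code; the function name `accuracy_excluding` and the smoking-status labels in the example are illustrative assumptions and do not come from the study.

```python
from typing import Sequence

# Label used by the gold standard for cases that cannot be scored.
INSUFFICIENT = "Insufficient Data"


def accuracy_excluding(
    gold: Sequence[str],
    predicted: Sequence[str],
    excluded_label: str = INSUFFICIENT,
) -> float:
    """Fraction of exact matches among cases whose gold label is not excluded.

    Hypothetical helper: the paper does not describe its scoring code.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    # Keep only the cases the gold standard was able to label.
    scored = [(g, p) for g, p in zip(gold, predicted) if g != excluded_label]
    if not scored:
        raise ValueError("no scorable cases after exclusion")
    correct = sum(1 for g, p in scored if g == p)
    return correct / len(scored)


# Made-up labels for five discharge summaries (not data from the study).
gold = ["Current Smoker", "Non-Smoker", "Insufficient Data", "Past Smoker", "Non-Smoker"]
pred = ["Current Smoker", "Non-Smoker", "Non-Smoker", "Non-Smoker", "Non-Smoker"]

print(accuracy_excluding(gold, pred))  # 3 of the 4 scored cases match: 0.75
```

Excluding unscorable cases shrinks the denominator, so an accuracy reported this way should be read alongside the number of cases that were dropped.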