Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, USA.
J Am Med Inform Assoc. 2010 Jul-Aug;17(4):383-8. doi: 10.1136/jamia.2010.004804.
Colorectal cancer (CRC) screening rates are low despite confirmed benefits. The authors investigated the use of natural language processing (NLP) to identify previous colonoscopy screening in electronic records from a random sample of 200 patients at least 50 years old. The authors developed algorithms to recognize temporal expressions and 'status indicators', such as 'patient refused' or 'test scheduled'. The new methods were added to the existing KnowledgeMap concept identifier system, and the resulting system was used to parse electronic medical records (EMR) to detect completed colonoscopies. Using expert physicians' manual review of EMR notes as the 'gold standard', the system identified timing references with a recall of 0.91 and precision of 0.95, colonoscopy status indicators with a recall of 0.82 and precision of 0.95, and references to actually completed colonoscopies with a recall of 0.93 and precision of 0.95. The system was superior to using colonoscopy billing codes alone. Health services researchers and clinicians may find NLP a useful adjunct to traditional methods to detect CRC screening status. Further investigations must validate extension of NLP approaches for other types of CRC screening applications.
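As a rough illustration of the ideas described in the abstract, the sketch below flags a colonoscopy mention, checks the same sentence for a status indicator, extracts a simple temporal expression, and computes precision and recall against a gold-standard set. It is a minimal sketch under stated assumptions, not the KnowledgeMap implementation: the regular expressions, the function names `classify_sentence` and `precision_recall`, and the sentence-level scope are all hypothetical choices made for this example.

```python
import re

# Hypothetical patterns for illustration only; these are NOT the rules
# used by KnowledgeMap or by the authors.
PROCEDURE = re.compile(r"\bcolonoscop(?:y|ies)\b", re.IGNORECASE)
NOT_DONE = re.compile(
    r"\b(?:refus\w*|declin\w*|schedul\w*|recommend\w*|discuss\w*|due for)\b",
    re.IGNORECASE,
)
TEMPORAL = re.compile(
    r"\b(?:(?:19|20)\d{2}|\d{1,2}/\d{1,2}/\d{2,4}"
    r"|\d+\s+(?:year|month|week)s?\s+ago)\b",
    re.IGNORECASE,
)


def classify_sentence(sentence):
    """Return None if the sentence has no colonoscopy mention; otherwise
    return a dict with an assumed status and any timing reference found."""
    if not PROCEDURE.search(sentence):
        return None
    # Assumption: a status indicator in the same sentence modifies the mention.
    status = "not_completed" if NOT_DONE.search(sentence) else "completed"
    timing = TEMPORAL.search(sentence)
    return {"status": status, "timing": timing.group(0) if timing else None}


def precision_recall(predicted, gold):
    """Precision and recall of a predicted set against a gold-standard set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

For example, `classify_sentence("Patient refused colonoscopy.")` yields a `not_completed` status, while `classify_sentence("Colonoscopy in 2008 was normal.")` yields `completed` with timing `2008`. A real system would also need negation handling, section awareness, and resolution of relative dates, none of which this sketch attempts.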