Denny Joshua C, Miller Randolph A, Waitman Lemuel Russell, Arrieta Mark A, Peterson Joshua F
Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.
Int J Med Inform. 2009 Apr;78 Suppl 1(Suppl 1):S34-42. doi: 10.1016/j.ijmedinf.2008.09.001. Epub 2008 Oct 19.
Typically detected via electrocardiograms (ECGs), QT interval prolongation is a known risk factor for sudden cardiac death. Since medications can promote or exacerbate the condition, detection of QT interval prolongation is important for clinical decision support. We investigated the accuracy of natural language processing (NLP) for identifying QT prolongation from cardiologist-generated, free-text ECG impressions compared to corrected QT (QTc) thresholds reported by ECG machines.
After integrating negation detection into a locally developed natural language processor, the KnowledgeMap concept identifier, we compared NLP-based detection of QT prolongation with the calculated QTc on a set of 44,318 ECGs obtained from hospitalized patients. We also created a string query using regular expressions to identify QT prolongation. We calculated sensitivity and specificity of the methods using manual physician review of the cardiologist-generated reports as the gold standard. To investigate causes of "false positive" calculated QTc values, we manually reviewed randomly selected ECGs with a long calculated QTc but no mention of QT prolongation. Separately, before developing the NLP query for QT prolongation, we validated the negation detection algorithm on 5,000 manually categorized ECG phrases covering any medical concept (not limited to QT prolongation).
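The string-query approach described above can be illustrated with a minimal sketch. The patterns, negation cues, and function name below are assumptions made for illustration only; the abstract does not give the regular expressions actually used, and this is not the KnowledgeMap negation algorithm.

```python
import re

# Hypothetical pattern (an assumption, not the study's query): matches phrases
# such as "prolonged QT", "long QTc", "QT prolongation", "QT interval prolongation".
QT_PATTERN = re.compile(
    r"\b(?:prolonged|long)\s+qtc?\b"
    r"|\bqtc?\s+(?:interval\s+)?(?:prolongation|prolonged)\b",
    re.IGNORECASE,
)

# Hypothetical negation cues; a mention is ignored when one of these
# appears earlier in the same sentence.
NEGATION_PATTERN = re.compile(r"\b(?:no|not|without)\b", re.IGNORECASE)


def mentions_qt_prolongation(impression: str) -> bool:
    """Return True if any sentence asserts QT prolongation without a preceding negation cue."""
    # Split the free-text impression into rough sentences.
    for sentence in re.split(r"[.;\n]", impression):
        match = QT_PATTERN.search(sentence)
        if match is None:
            continue
        # Skip the mention if a negation cue precedes it in this sentence.
        if NEGATION_PATTERN.search(sentence[: match.start()]):
            continue
        return True
    return False


if __name__ == "__main__":
    print(mentions_qt_prolongation("Sinus rhythm. Prolonged QT interval."))
    print(mentions_qt_prolongation("Sinus tachycardia. No QT prolongation."))
```

A real query would need a much broader set of phrasings and negation cues than this sketch covers, which is one reason the study validated negation detection separately.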
The NLP query for QT prolongation correctly identified 2,364 of 2,373 ECGs with QT prolongation, for a sensitivity of 0.996, and produced no false positives, for a positive predictive value of 1.000. The regular expression query had a sensitivity of 0.999 and a positive predictive value of 0.982. In contrast, the positive predictive value of common QTc thresholds derived from ECG machines ranged from 0.07 to 0.25, with corresponding sensitivities of 0.994 to 0.046. The negation detection algorithm had a recall of 0.973 and a precision of 0.982 for 10,490 concepts found within ECG impressions.
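As a worked check of the NLP figures, sensitivity is TP / (TP + FN) and positive predictive value is TP / (TP + FP). The short sketch below applies these standard definitions to the reported counts (2364 correctly identified out of 2373, no false positives); the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly positive ECGs that the query flagged."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of flagged ECGs that were truly positive."""
    return true_pos / (true_pos + false_pos)


# Counts reported for the NLP query.
true_pos = 2364
false_neg = 2373 - 2364  # ECGs with QT prolongation that the query missed
false_pos = 0            # the abstract reports no false positives

print(round(sensitivity(true_pos, false_neg), 3))                # 0.996
print(round(positive_predictive_value(true_pos, false_pos), 3))  # 1.0
```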
NLP and regular expression queries of cardiologists' ECG interpretations can more effectively identify QT prolongation than the automated QTc intervals reported by ECG machines. Future clinical decision support could employ NLP queries to detect QTc prolongation and other reported ECG abnormalities.