Kaiser Permanente Southern California, Pasadena, CA, USA.
Int J Med Inform. 2019 Jul;127:27-34. doi: 10.1016/j.ijmedinf.2019.04.009. Epub 2019 Apr 13.
Local reactions are the most common vaccine-related adverse event, yet there is no specific diagnosis code for a local reaction due to vaccination. Previous vaccine safety studies used non-specific diagnosis codes to identify potential local reaction cases and confirmed the cases through manual chart review. In this study, a natural language processing (NLP) algorithm was developed to identify local reactions associated with the tetanus-diphtheria-acellular pertussis (Tdap) vaccine in the Vaccine Safety Datalink.
Presumptive cases of local reactions were identified among members ≥ 11 years of age using ICD-9-CM codes in all care settings in the 1-6 days following a Tdap vaccination between 2012 and 2014. The clinical notes were searched for signs and symptoms consistent with a local reaction. Information on the timing and the location of each sign or symptom was also extracted to help determine whether it was vaccine-related. Reactions triggered by causes other than Tdap vaccination were excluded. The NLP algorithm was developed at the lead study site and validated on a stratified random sample of 500 patients from five institutions.
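The note search described above can be illustrated with a minimal rule-based sketch. It is not the study's algorithm: the term lists, the sentence-level co-occurrence rule, and the function names `in_risk_window` and `flag_note` are all assumptions made for illustration. Only the 1-6 day window comes from the text.

```python
import re
from datetime import date

# Hypothetical term lists; the study's actual lexicon is not given in the abstract.
SYMPTOM_TERMS = ["redness", "swelling", "pain", "erythema", "induration"]
LOCATION_TERMS = ["arm", "deltoid", "injection site", "shoulder"]

SYMPTOM_RE = re.compile(r"\b(" + "|".join(SYMPTOM_TERMS) + r")\b", re.IGNORECASE)
LOCATION_RE = re.compile(r"\b(" + "|".join(LOCATION_TERMS) + r")\b", re.IGNORECASE)


def in_risk_window(vaccination: date, note: date,
                   first_day: int = 1, last_day: int = 6) -> bool:
    """True if the note is dated 1-6 days after vaccination (window from the text)."""
    days = (note - vaccination).days
    return first_day <= days <= last_day


def flag_note(text: str, vaccination: date, note: date) -> bool:
    """Flag a note when one sentence mentions both a symptom and a plausible
    location, and the note date falls inside the risk window.
    The same-sentence rule is an assumption, not the published method."""
    if not in_risk_window(vaccination, note):
        return False
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if SYMPTOM_RE.search(sentence) and LOCATION_RE.search(sentence):
            return True
    return False
```

A real system would also need negation handling ("no swelling") and exclusion of reactions with other stated causes, which this sketch omits.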
The NLP algorithm achieved an overall weighted sensitivity of 87.9%, specificity of 92.8%, positive predictive value of 82.7%, and negative predictive value of 95.1%. In addition, using data from one site, the NLP algorithm identified 3326 potential Tdap-related local reactions that were not identified through diagnosis codes.
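For readers unfamiliar with weighted validation metrics, the sketch below shows one common way to compute them from a stratified sample. The abstract does not say how the weights were derived, so the inverse-sampling-fraction weighting, the `Stratum` class, and the `weighted_metrics` function are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Stratum:
    """Chart-review results for one sampling stratum.
    weight is assumed to be the inverse of the stratum's sampling fraction."""
    weight: float
    tp: int  # NLP positive, chart review positive
    fp: int  # NLP positive, chart review negative
    fn: int  # NLP negative, chart review positive
    tn: int  # NLP negative, chart review negative


def weighted_metrics(strata: Iterable[Stratum]) -> Dict[str, float]:
    """Pool weighted confusion-matrix cells across strata, then compute
    sensitivity, specificity, PPV, and NPV from the pooled cells."""
    strata = list(strata)
    tp = sum(s.weight * s.tp for s in strata)
    fp = sum(s.weight * s.fp for s in strata)
    fn = sum(s.weight * s.fn for s in strata)
    tn = sum(s.weight * s.tn for s in strata)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }
```

Calling `weighted_metrics([Stratum(1.0, 8, 2, 2, 8), Stratum(2.0, 1, 1, 0, 8)])` uses made-up counts purely to exercise the function. They are not study data.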
The NLP algorithm achieved high accuracy and demonstrated the potential of NLP to reduce the manual chart review effort in vaccine safety studies.