Graves Janessa M, Whitehill Jennifer M, Hagel Brent E, Rivara Frederick P
College of Nursing, Washington State University, Spokane, WA, USA; Harborview Injury Prevention and Research Center (HIPRC), University of Washington, Seattle, WA, USA.
Department of Health Policy and Management, School of Public Health and Health Sciences, University of Massachusetts Amherst, Amherst, MA, USA.
Injury. 2015 May;46(5):891-7. doi: 10.1016/j.injury.2014.11.012. Epub 2014 Nov 26.
Free-text fields in injury surveillance databases can provide detailed information beyond routinely coded data. Additional data, such as exposures and covariates, can be identified from narrative text and used to conduct case-control studies.
To illustrate this, we developed a text-search algorithm to identify helmet status (worn, not worn, use unknown) in U.S. National Electronic Injury Surveillance System (NEISS) narratives for bicycling and other sports injuries from 2005 to 2011. We calculated adjusted odds ratios (ORs) for head injury associated with helmet use, with non-head injuries serving as controls. For bicycling, we validated the ORs against published estimates. We also calculated ORs for other sports and examined factors associated with helmet reporting.
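As an illustration of what a simple text-search classifier of this kind might look like, the following Python sketch sorts a narrative into the three helmet-status categories named above. The regular expressions, the function name `helmet_status`, and the sample narratives are assumptions made for this example; the abstract does not list the study's actual search terms.

```python
import re

# Hypothetical search patterns (not the study's actual terms).
# Negated mentions are checked first, because a phrase such as
# "not wearing helmet" also contains the positive phrase "wearing helmet".
NOT_WORN = re.compile(
    r"\b(?:no|w/o|without|not\s+wearing(?:\s+a)?)\s+helmet\b"
    r"|\bunhelmeted\b"
    r"|\bhelmet\s+not\s+worn\b",
    re.IGNORECASE,
)
WORN = re.compile(
    r"\b(?:wearing(?:\s+a)?|with|had(?:\s+a)?)\s+helmet\b"
    r"|\bhelmeted\b",
    re.IGNORECASE,
)


def helmet_status(narrative: str) -> str:
    """Classify a free-text narrative as 'worn', 'not worn', or 'unknown'."""
    if NOT_WORN.search(narrative):
        return "not worn"
    if WORN.search(narrative):
        return "worn"
    return "unknown"


# Invented narratives, written only to exercise each branch.
samples = [
    "12YOM FELL OFF BIKE, NOT WEARING HELMET, HIT HEAD",
    "PT RIDING BICYCLE WITH HELMET STRUCK BY CAR",
    "FELL FROM BICYCLE ONTO ARM",
]
for text in samples:
    print(helmet_status(text), "|", text)
```

Checking negated phrases before positive ones is a design choice of this sketch; narratives that match neither pattern fall into the "unknown" category and would be excluded from a case-control comparison.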
Of 105,614 bicycling injury narratives reviewed, 14.1% contained sufficient helmet information for use in the case-control study. The adjusted ORs for head injuries associated with helmet-wearing were smaller than, but directionally consistent with, previously published estimates (e.g., 1999 Cochrane Review). ORs for other sports were also less than 1, indicating a protective effect of helmets.
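To make the interpretation of an OR below 1 concrete, the sketch below computes a crude (unadjusted) odds ratio with a Woolf-type 95% confidence interval from a 2x2 table. This is a simpler stand-in for the adjusted ORs reported in the study, and the counts are invented for illustration only; they are not study data. The function name `crude_odds_ratio` is likewise an assumption of this example.

```python
import math


def crude_odds_ratio(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Crude odds ratio and Woolf (log-based) confidence interval.

    a: helmeted cases (head injury)      b: helmeted controls (non-head injury)
    c: unhelmeted cases (head injury)    d: unhelmeted controls (non-head injury)
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT taken from the study.
or_, lower, upper = crude_odds_ratio(a=20, b=80, c=40, d=60)
print(f"OR = {or_:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

With these invented counts the odds of head injury among helmeted riders are lower than among unhelmeted riders, so the OR falls below 1, which is the direction described as protective.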
This exploratory analysis illustrates the potential utility of relatively simple text-search algorithms to identify additional variables in surveillance data. Limitations of this study include possible selection bias and the inability to identify individuals with multiple injuries. A similar approach can be applied to study other injuries, conditions, risks, or protective factors. This approach may serve as an efficient method to extend the utility of injury surveillance data to conduct epidemiological research.