The use of narrative text for injury surveillance research: a systematic review.

Affiliation

National Centre for Health Information Research and Training, Queensland University of Technology, Brisbane, Queensland, Australia.

Publication information

Accid Anal Prev. 2010 Mar;42(2):354-63. doi: 10.1016/j.aap.2009.09.020. Epub 2009 Oct 24.

Abstract

OBJECTIVE

To summarise the extent to which narrative text fields in administrative health data are used to gather information about the event that led to presentation to a health care provider for treatment of an injury, and to highlight best practice approaches to interrogating narrative text for injury surveillance purposes.

DESIGN

Systematic review.

DATA SOURCES

Electronic databases searched included CINAHL, Google Scholar, Medline, Proquest, PubMed and PubMed Central. Snowballing strategies were employed by searching the bibliographies of retrieved references to identify relevant associated articles.

SELECTION CRITERIA

Papers were selected if the study used a health-related database and if the study objectives were to a) use text fields to identify injury cases or to extract additional information on injury circumstances not available from coded data, b) use text fields to assess the accuracy of coded data fields for injury-related cases, or c) describe methods or approaches for extracting injury information from text fields.

METHODS

The papers identified through the search were independently screened by two authors for inclusion, resulting in 41 papers selected for review. Due to heterogeneity between studies, meta-analysis was not performed.

RESULTS

The majority of papers reviewed (28 papers) focused on describing injury epidemiology trends using coded data, with text fields used to supplement the coded data. These studies demonstrated the value of text data for providing more specific information beyond what had been coded, either to enable case selection or to provide circumstantial information. Caveats were expressed about the consistency and completeness of recorded text information, which can result in underestimates when these data are used. Four coding validation papers were reviewed, and these studies showed the utility of text data for validating and checking the accuracy of coded data. Seven studies (9 papers) described methods for interrogating injury text fields for systematic extraction of information, with a combination of manual and semi-automated methods used to refine and develop algorithms for extraction and classification of coded data from text. Quality assurance approaches to assessing the robustness of the methods for extracting text data were discussed in only 8 of the epidemiology papers and 1 of the coding validation papers. All of the text interrogation methodology papers described systematic approaches to ensuring the quality of the approach.
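
As a rough illustration of the text search and quality assurance ideas summarised above, the following minimal Python sketch flags fall-related cases by keyword and then checks the flags against manual review. It is not drawn from any of the reviewed papers: the records, the search terms, and the function names `flag_fall` and `validate` are all hypothetical and invented for this example.

```python
import re

# Hypothetical narrative text records paired with a manually reviewed label
# (True = a reviewer judged the injury to be a fall). Invented for illustration.
RECORDS = [
    ("pt fell from ladder while cleaning gutters", True),
    ("slipped on wet floor and landed on wrist", True),
    ("cut finger with kitchen knife", False),
    ("tripped over rug, fall onto left hip", True),
    ("burn to hand from hot oil", False),
    ("fell asleep at wheel, car hit tree", False),
]

# Hypothetical search terms for fall-related injuries.
FALL_PATTERN = re.compile(r"\b(fell|fall|slipped|tripped)\b", re.IGNORECASE)


def flag_fall(narrative: str) -> bool:
    """Return True if the narrative contains any fall-related keyword."""
    return FALL_PATTERN.search(narrative) is not None


def validate(records):
    """Compare keyword flags against manual review; return (sensitivity, ppv)."""
    tp = sum(1 for text, truth in records if flag_fall(text) and truth)
    fp = sum(1 for text, truth in records if flag_fall(text) and not truth)
    fn = sum(1 for text, truth in records if not flag_fall(text) and truth)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, ppv


sensitivity, ppv = validate(RECORDS)
print(f"sensitivity={sensitivity:.2f}, ppv={ppv:.2f}")
```

In this toy data the record "fell asleep at wheel" is flagged although it is not a fall, which is the kind of error that a manual check of a sample can surface before a keyword list is used for case selection.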

CONCLUSIONS

Manual review and coding approaches, text search methods, and statistical tools have been used to extract data from narrative text and translate it into usable, detailed injury event information. These techniques can be, and have been, applied to administrative datasets to identify specific injury types and add value to previously coded injury datasets. Only a few studies thoroughly described the methods used for text mining, and fewer than half of the studies reviewed used or described quality assurance methods for ensuring the robustness of the approach. New techniques using semi-automated computerised approaches and Bayesian/clustering statistical methods offer the potential to further develop and standardise the analysis of narrative text for injury surveillance.
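
To give a sense of what a Bayesian approach to classifying narrative text might look like, here is a minimal sketch of a multinomial naive Bayes classifier with Laplace smoothing, written in plain Python. It is one possible Bayesian technique, not necessarily the method used in any reviewed study. The training narratives, the category labels, and the function names `train` and `classify` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical narratives with manually assigned codes; invented for illustration.
TRAINING = [
    ("fell from ladder", "fall"),
    ("slipped on wet floor", "fall"),
    ("tripped on stairs and fell", "fall"),
    ("cut finger with knife", "cut"),
    ("laceration from broken glass", "cut"),
    ("knife slipped cut hand", "cut"),
]


def train(examples):
    """Count words per class and documents per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def classify(text, model):
    """Return the class with the highest log posterior (Laplace smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior: share of training narratives carrying this code.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Smoothed log likelihood of the word given the class.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
print(classify("fell on wet stairs", model))
print(classify("cut hand on glass", model))
```

A real application would need far more training narratives and a held-out, manually coded sample to estimate how often the assigned codes are wrong.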
