Piniewski-Bond J F, Buck G M, Horowitz R S, Schuster J H, Weed D L, Weiner J M
State University of New York at Buffalo, Buffalo, NY 14263, USA.
J Am Med Inform Assoc. 2001 Mar-Apr;8(2):174-84. doi: 10.1136/jamia.2001.0080174.
To examine the type of information obtainable from scientific papers, using three different methods for the extraction, organization, and preparation of literature reviews.
A set of three review papers was identified, and the ideas presented by their authors were extracted. The 161 articles referenced in those three reviews were then analyzed using 1) a formalized data extraction approach, a protocol-driven manual process that extracts the variables, values, and statistical significance of the stated relationships; and 2) a computerized approach known as "Idea Analysis," in which a software program reads the abstracts of the original articles and organizes the ideas presented by their authors. The results were then compared. The literature focused on the human papillomavirus and its relationship to cervical cancer.
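The comparison described above can be pictured as a set-overlap calculation: the ideas in the review papers form a reference set, and each method is scored by the share of that set it recovers. The following is a minimal sketch under that assumption, not the procedure the authors used. The function name `coverage` and the `idea_N` labels are hypothetical placeholders, not data from the study.

```python
def coverage(reference, found):
    """Return the percentage of reference ideas that also appear in found."""
    if not reference:
        raise ValueError("reference set must not be empty")
    return 100.0 * len(reference & found) / len(reference)


# Placeholder labels for illustration only; these are not ideas from the study.
review_ideas = {"idea_1", "idea_2", "idea_3", "idea_4"}
manual_extraction = {"idea_1", "idea_9"}         # hypothetical manual output
idea_analysis = {"idea_1", "idea_2", "idea_3"}   # hypothetical software output

manual_pct = coverage(review_ideas, manual_extraction)
software_pct = coverage(review_ideas, idea_analysis)
# The combined score uses the union of what both methods found.
combined_pct = coverage(review_ideas, manual_extraction | idea_analysis)
```

Scoring the union rather than adding the two percentages avoids counting an idea twice when both methods find it.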
Idea Analysis was able to identify 68.9 percent of the ideas considered by the authors of the three review papers to be of importance in describing the association between human papillomavirus and cervical cancer. The formalized data extraction identified 27 percent of the authors' ideas. The combination of the two approaches identified 74.3 percent of the ideas considered important in the relationship between human papillomavirus and cervical cancer, as reported by the authors of the three review articles.
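The three reported percentages also imply how much the two approaches overlapped. The sketch below applies inclusion-exclusion, assuming all three figures share the same denominator (the full set of ideas from the three reviews); the abstract does not state this explicitly. The function name `coverage_breakdown` and the dictionary keys are hypothetical.

```python
def coverage_breakdown(pct_a, pct_b, pct_union):
    """Split coverage percentages into both / only A / only B / neither.

    Assumes pct_a, pct_b, and pct_union are measured against the same
    reference set, so that |A and B| = |A| + |B| - |A or B|.
    """
    return {
        "both": round(pct_a + pct_b - pct_union, 1),
        "only_a": round(pct_union - pct_b, 1),
        "only_b": round(pct_union - pct_a, 1),
        "neither": round(100.0 - pct_union, 1),
    }


# A = Idea Analysis (68.9), B = formalized data extraction (27), union = 74.3
breakdown = coverage_breakdown(68.9, 27.0, 74.3)
```

Under that assumption, about 21.6 percent of the ideas were found by both approaches, 47.3 percent by Idea Analysis alone, 5.4 percent by formalized data extraction alone, and 25.7 percent by neither.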
This research demonstrated that both a technically derived and a computer-derived collection, categorization, and summarization of original articles and abstracts could provide a reliable, valid, and reproducible source of ideas that duplicates, to a major degree, the ideas presented by subject specialists in review articles. As such, these tools may be useful to experts preparing literature reviews, because they eliminate many of the clerical and mechanical tasks associated with present-day scientific text processing.