Towards a characterization of apparent contradictions in the biomedical literature using context analysis.

Author information

National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA.

Publication information

J Biomed Inform. 2019 Oct;98:103275. doi: 10.1016/j.jbi.2019.103275. Epub 2019 Aug 29.

Abstract

BACKGROUND

With the substantial growth in the biomedical research literature, a larger number of claims are published daily, some of which seemingly disagree with or contradict prior claims on the same topics. Resolving such contradictions is critical to advancing our understanding of human disease and developing effective treatments. Automated text analysis techniques can facilitate such analysis by extracting claims from the literature, flagging those that are potentially contradictory, and identifying any study characteristics that may explain such contradictions.

METHODS

Using SemMedDB, our own PubMed-scale repository of semantic predications (subject-relation-object triples), we identified apparent contradictions in the biomedical research literature and developed a categorization of contextual characteristics that explain such contradictions. Clinically relevant semantic predications relating to 20 diseases and involving opposing predicate pairs (e.g., an intervention treats or causes a disease) were retrieved from SemMedDB. After addressing inference, uncertainty, generic concepts, and NLP errors through automatic and manual filtering steps, a set of apparent contradictions was identified and characterized.
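The retrieval of opposing predicate pairs described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Predication` type, the `candidate_contradictions` helper, the `TREATS`/`CAUSES` labels (modeled on the "treats or causes" example), and the toy records with placeholder PMIDs are all assumptions made for illustration. The sketch only pairs predications that share a subject and an object but carry opposing predicates; the filtering for inference, uncertainty, generic concepts, and NLP errors is not modeled.

```python
from collections import defaultdict
from itertools import product
from typing import NamedTuple


class Predication(NamedTuple):
    """A subject-relation-object triple plus its source abstract (hypothetical type)."""
    subject: str
    predicate: str
    obj: str
    pmid: str  # placeholder identifiers, not real PubMed IDs


# Assumed opposing pair, following the abstract's "treats or causes" example.
OPPOSING_PAIRS = [("TREATS", "CAUSES")]


def candidate_contradictions(predications):
    """Pair predications that share subject and object but use opposing predicates."""
    # Group by (subject, object), then by predicate.
    by_args = defaultdict(lambda: defaultdict(list))
    for p in predications:
        by_args[(p.subject, p.obj)][p.predicate].append(p)

    pairs = []
    for preds in by_args.values():
        for left, right in OPPOSING_PAIRS:
            # Every "left" instance is paired with every "right" instance.
            pairs.extend(product(preds.get(left, []), preds.get(right, [])))
    return pairs


# Toy data: only "drug A" has both a TREATS and a CAUSES claim for "disease X".
toy = [
    Predication("drug A", "TREATS", "disease X", "PMID-1"),
    Predication("drug A", "CAUSES", "disease X", "PMID-2"),
    Predication("drug B", "TREATS", "disease X", "PMID-3"),
]
pairs = candidate_contradictions(toy)
for a, b in pairs:
    print(f"{a.subject}: {a.predicate} ({a.pmid}) vs {b.predicate} ({b.pmid}) {a.obj}")
```

Each pair produced this way is only a candidate; per the abstract, most candidates were ruled out by later filtering and manual review.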

RESULTS

We retrieved 117,676 predication instances from 62,360 PubMed abstracts (Jan 1980-Dec 2016). From these instances, automatic filtering steps generated 2236 candidate contradictory pairs. Through manual analysis, we determined that 58 of these pairs (2.6%) were apparent contradictions. We identified five main categories of contextual characteristics that explain these contradictions: (a) internal to the patient, (b) external to the patient, (c) endogenous/exogenous, (d) known controversy, and (e) contradictions in literature. Categories (a) and (b) were subcategorized further (e.g., species, dosage) and accounted for the bulk of the contradictory information.
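As a quick arithmetic check on the figures reported above, the short sketch below recomputes the share of candidate pairs that were judged apparent contradictions. The variable names are illustrative; the counts are taken directly from the abstract.

```python
# Counts reported in the abstract.
predication_instances = 117_676
source_abstracts = 62_360
candidate_pairs = 2236
apparent_contradictions = 58

# Share of candidate pairs confirmed as apparent contradictions by manual analysis.
share = apparent_contradictions / candidate_pairs
pct = round(100 * share, 1)
print(f"{apparent_contradictions} of {candidate_pairs} candidate pairs = {pct}%")
```

The result matches the 2.6% stated in the text.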

CONCLUSIONS

Semantic predications, by accounting for lexical variability, and SemMedDB, owing to its literature scale, can support identification and elucidation of potentially contradictory claims across the biomedical domain. Further filtering and classification steps are needed to distinguish the truly contradictory claims among them. The ability to detect contradictions automatically can facilitate important biomedical knowledge management tasks, such as tracking and verifying scientific claims, summarizing research on a given topic, identifying knowledge gaps, and assessing evidence for systematic reviews, with potential benefits to the scientific community. Future work will focus on automating these steps for fully automatic recognition of contradictions from the biomedical research literature.
