Ontology-based literature mining of E. coli vaccine-associated gene interaction networks.

Author information

Hur Junguk, Özgür Arzucan, He Yongqun

Affiliations

Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND, 58202, USA.

Department of Computer Engineering, Bogazici University, Istanbul, 34342, Turkey.

Publication information

J Biomed Semantics. 2017 Mar 14;8(1):12. doi: 10.1186/s13326-017-0122-4.

Abstract

BACKGROUND

Pathogenic Escherichia coli infections cause various diseases in humans and many animal species. However, despite extensive E. coli vaccine research, we are still unable to fully protect ourselves against E. coli infections. To develop effective and safe E. coli vaccines more rationally, it is important to better understand E. coli vaccine-associated gene interaction networks.

METHODS

In this study, we first extended the Vaccine Ontology (VO) to semantically represent various E. coli vaccines and the genes used in vaccine development. We also normalized E. coli gene names compiled from the annotations of various E. coli strains using a pan-genome-based annotation strategy. The Interaction Network Ontology (INO) includes a hierarchy of various interaction-related keywords useful for literature mining. Using VO, INO, and the normalized E. coli gene names, we applied an ontology-based SciMiner literature mining strategy to mine all PubMed abstracts and retrieve E. coli vaccine-associated E. coli gene interactions. Four centrality metrics (degree, eigenvector, closeness, and betweenness) were calculated to identify highly ranked genes and interaction types.
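As a rough illustration of the four centrality metrics named above, the following sketch computes them in plain Python on a tiny undirected toy graph. It is not the study's pipeline: the node names (`hub`, `a`, `b`, `c`), the adjacency-dictionary representation, and the normalization choices are assumptions made for the example.

```python
from collections import deque

def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def bfs_distances(adj, source):
    """Shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def closeness_centrality(adj):
    """(n - 1) / sum of distances; assumes a connected graph."""
    n = len(adj)
    result = {}
    for v in adj:
        total = sum(bfs_distances(adj, v).values())
        result[v] = (n - 1) / total if total else 0.0
    return result

def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected graph (Brandes' algorithm)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

def eigenvector_centrality(adj, iterations=100):
    """Power iteration on A + I; the shift keeps the same eigenvectors
    but avoids oscillation on bipartite graphs."""
    x = dict.fromkeys(adj, 1.0)
    for _ in range(iterations):
        new = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = sum(val * val for val in new.values()) ** 0.5
        x = {v: val / norm for v, val in new.items()}
    return x

# Toy star graph (hypothetical nodes, not data from the study).
toy = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
for name, fn in [("degree", degree_centrality), ("eigenvector", eigenvector_centrality),
                 ("closeness", closeness_centrality), ("betweenness", betweenness_centrality)]:
    print(name, fn(toy))
```

In this toy graph every metric ranks `hub` first, which is the kind of signal a centrality analysis would use to flag highly ranked genes or interaction types in a larger network.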

RESULTS

Using vaccine-related PubMed abstracts, our study identified 11,350 sentences that contain 88 unique INO interaction types and 1,781 unique E. coli genes. Each sentence contained at least one interaction type and two unique E. coli genes. An E. coli gene interaction network of genes and INO interaction types was created. From this large network, we identified a sub-network consisting of 5 E. coli vaccine genes (carA, carB, fimH, fepA, and vat), 62 other E. coli genes, and 25 INO interaction types. While many interaction types represent direct interactions between the two indicated genes, our study also showed that many of the retrieved interaction types are indirect, in that the two genes are both required for the specified interaction process but participate in it without interacting directly. Our centrality analysis of these gene interaction networks identified top-ranked E. coli genes and 6 INO interaction types (e.g., regulation and gene expression).
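The sentence-level criterion described above (at least one interaction type and two unique genes per sentence) and the extraction of a sub-network around vaccine genes can be sketched as follows. This is a simplified stand-in, not SciMiner: the gene names, keyword-to-type mapping, and sentences are invented placeholders, matching is naive exact-token lookup, and defining the sub-network as "all interactions that involve a seed gene" is an assumption, since the abstract does not state how the sub-network was selected.

```python
import re
from itertools import combinations

# Hypothetical dictionaries standing in for normalized gene names
# and INO interaction keywords.
GENES = {"geneA", "geneB", "geneC", "geneD"}
KEYWORDS = {
    "regulates": "regulation",
    "regulated": "regulation",
    "expression": "gene expression",
}

def annotate(sentence):
    """Return the set of known genes and interaction types in a sentence."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    genes = {t for t in tokens if t in GENES}
    types = {KEYWORDS[t.lower()] for t in tokens if t.lower() in KEYWORDS}
    return genes, types

def extract_triples(sentences):
    """Keep sentences with >= 2 unique genes and >= 1 interaction type,
    and emit (gene1, interaction_type, gene2) triples for each."""
    triples = set()
    for sentence in sentences:
        genes, types = annotate(sentence)
        if len(genes) < 2 or not types:
            continue
        for g1, g2 in combinations(sorted(genes), 2):
            for t in types:
                triples.add((g1, t, g2))
    return triples

def seed_subnetwork(triples, seeds):
    """Keep only triples in which at least one gene is a seed gene."""
    return {tr for tr in triples if tr[0] in seeds or tr[2] in seeds}

# Invented placeholder sentences, not taken from PubMed.
sentences = [
    "geneA regulates geneB in this toy sentence.",
    "Expression of geneC was regulated by geneD.",
    "geneA was detected.",                 # only one gene: dropped
    "geneB and geneC were both present.",  # no interaction keyword: dropped
]

triples = extract_triples(sentences)
sub = seed_subnetwork(triples, seeds={"geneA"})
print(sorted(triples))
print(sorted(sub))
```

Note that co-occurrence in a sentence does not distinguish direct from indirect interactions, which is consistent with the observation above that many retrieved interaction types are indirect.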

CONCLUSIONS

A vaccine-related E. coli gene-gene interaction network was constructed using an ontology-based literature mining strategy, which identified important E. coli vaccine genes and their interactions with other genes through specific interaction types.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45d7/5348867/e09d33542c23/13326_2017_122_Fig1_HTML.jpg
