A mouse protein interactome through combined literature mining with multiple sources of interaction evidence.

Affiliation

Sichuan Key Laboratory of Molecular Biology and Biotechnology, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, College of Life Sciences, Sichuan University, 610065, Chengdu, People's Republic of China.

Publication information

Amino Acids. 2010 Apr;38(4):1237-52. doi: 10.1007/s00726-009-0335-7. Epub 2009 Aug 8.

Abstract

Protein-protein interactions (PPIs) play crucial roles in a number of biological processes. Recently, protein interaction networks (PINs) for several model organisms and humans have been generated, but few large-scale studies of mice have been carried out, either experimentally or computationally. In this work, we undertook an effort to map a mouse PIN from protein interactions hidden in the enormous body of biomedical literature. Following a co-occurrence-based text-mining approach, a naïve Bayesian probabilistic model was used to filter false-positive interactions by integrating heterogeneous kinds of evidence from genomic and proteomic datasets. A support vector machine algorithm was further used to select protein pairs with physical interactions. Comparison with the currently available PPI datasets from several model organisms and humans showed that the derived mouse PINs have similar topological properties at the global level but high local divergence. The mouse protein interaction dataset is stored in the Mouse protein-protein interaction DataBase (MppDB), which is a useful source of information for system-level understanding of gene function and biological processes in mammals. The MppDB database is publicly available at http://bio.scu.edu.cn/mppi.

