ContactPFP: Protein function prediction using predicted contact information.

Author information

Kagaya Yuki, Flannery Sean T, Jain Aashish, Kihara Daisuke

Affiliations

Department of Biological Sciences, Purdue University, West Lafayette, IN, US.

Department of Computer Science, Purdue University, West Lafayette, IN, US.

Publication information

Front Bioinform. 2022 Jun;2. doi: 10.3389/fbinf.2022.896295. Epub 2022 Jun 2.

Abstract

Computational function prediction is one of the most important problems in bioinformatics as elucidating the function of genes is a central task in molecular biology and genomics. Most of the existing function prediction methods use protein sequences as the primary source of input information because the sequence is the most available information for query proteins. There are attempts to consider other attributes of query proteins. Among these attributes, the three-dimensional (3D) structure of proteins is known to be very useful in identifying the evolutionary relationship of proteins, from which functional similarity can be inferred. Here, we report a novel protein function prediction method, ContactPFP, which uses predicted residue-residue contact maps as input structural features of query proteins. Although 3D structure information is known to be useful, it has not been routinely used in function prediction because the 3D structure is not experimentally determined for many proteins. In ContactPFP, we overcome this limitation by using residue-residue contact prediction, which has become increasingly accurate due to rapid development in the protein structure prediction field. ContactPFP takes a query protein sequence as input and uses predicted residue-residue contact as a proxy for the 3D protein structure. To characterize how predicted contacts contribute to function prediction accuracy, we compared the performance of ContactPFP with several well-established sequence-based function prediction methods. The comparative study revealed the advantages and weaknesses of ContactPFP compared to contemporary sequence-based methods. There were many cases where it showed higher prediction accuracy. We examined factors that affected the accuracy of ContactPFP using several illustrative cases that highlight the strength of our method.
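
For readers unfamiliar with the term, a residue-residue contact map can be illustrated with a short sketch. The abstract does not define a contact, so the 8 Å distance cutoff, the function name `contact_map`, and the toy coordinates below are all assumptions made for illustration. ContactPFP predicts contacts from sequence; this sketch only shows what such a map represents when coordinates are available, and is not the authors' pipeline.

```python
import math

def contact_map(coords, threshold=8.0):
    """Return a symmetric boolean matrix: True where two distinct
    residues lie closer than `threshold` (assumed cutoff, in angstroms)."""
    n = len(coords)
    return [
        [i != j and math.dist(coords[i], coords[j]) < threshold
         for j in range(n)]
        for i in range(n)
    ]

# Hypothetical coordinates: four residues spaced 3.8 angstroms apart on a line.
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cmap = contact_map(toy)

for row in cmap:
    print("".join("#" if c else "." for c in row))
```

A map of this kind is a two-dimensional summary of a fold, which is why a predicted map can stand in for an experimentally determined structure when none exists.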

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ee5/9580906/9aab0a5d456c/fbinf-02-896295-g001.jpg
