Towards region-specific propagation of protein functions.

Author affiliations

Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY, USA.

Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA.

Publication information

Bioinformatics. 2019 May 15;35(10):1737-1744. doi: 10.1093/bioinformatics/bty834.

Abstract

MOTIVATION

Due to the nature of experimental annotation, most protein function prediction methods operate at the protein level, where functions are assigned to full-length proteins based on overall similarities. However, most proteins function by interacting with other proteins or molecules, and many functional associations should be limited to specific regions rather than the entire protein length. Most domain-centric function prediction methods depend on accurate domain family assignments to infer relationships between domains and functions, so regions that are not assigned to a known domain family are left out of functional evaluation. Given the abundance of residue-level annotations currently available, we present a function prediction methodology that automatically infers function labels of specific protein regions using protein-level annotations and multiple types of region-specific features.
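The general idea of moving from protein-level labels to region-level labels can be illustrated with a toy sketch. This is not the authors' algorithm (the abstract does not specify it; see the linked repository for the actual implementation). It assumes a simple co-occurrence score between a region feature and a GO term, and every name here (`TRAIN`, `feature_term_scores`, `label_regions`, the feature identifiers `featA`/`featB`, the protein identifiers and the 0.6 threshold) is hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data, not taken from the paper:
# protein id -> (protein-level GO terms, list of (start, end, feature_id) regions)
# GO:0005524 is "ATP binding"; GO:0003677 is "DNA binding".
TRAIN = {
    "P1": ({"GO:0005524"}, [(10, 60, "featA"), (100, 150, "featB")]),
    "P2": ({"GO:0005524"}, [(5, 55, "featA")]),
    "P3": ({"GO:0003677"}, [(20, 80, "featB")]),
}


def feature_term_scores(train):
    """For each region feature, return the fraction of proteins carrying
    that feature that are annotated with each GO term (assumed scoring rule)."""
    carriers = defaultdict(set)
    for pid, (_terms, regions) in train.items():
        for _start, _end, feat in regions:
            carriers[feat].add(pid)

    scores = {}
    for feat, pids in carriers.items():
        counts = defaultdict(int)
        for pid in pids:
            for term in train[pid][0]:
                counts[term] += 1
        scores[feat] = {term: n / len(pids) for term, n in counts.items()}
    return scores


def label_regions(regions, scores, threshold=0.6):
    """Attach to each query region the GO terms whose score for the
    region's feature meets the (arbitrary) threshold."""
    labelled = []
    for start, end, feat in regions:
        terms = sorted(
            term for term, s in scores.get(feat, {}).items() if s >= threshold
        )
        labelled.append((start, end, terms))
    return labelled


SCORES = feature_term_scores(TRAIN)
# A hypothetical query protein with two annotated regions.
QUERY_LABELS = label_regions([(1, 50, "featA"), (70, 120, "featB")], SCORES)
print(QUERY_LABELS)
```

In this toy setting the term is attached only to the region whose feature consistently co-occurs with it, rather than to the whole protein; a region whose feature is ambiguous receives no label.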

RESULTS

We apply this method to local features obtained from InterPro, UniProtKB and amino acid sequences and show that it improves both the accuracy and the region-specificity of protein function transfer and prediction. We compare the region-level predictive performance of our method against that of a whole-protein baseline method on proteins with structurally verified binding sites, and we also compare protein-level predictive performance on a temporal holdout to expand the variety and specificity of the GO terms we could evaluate. Our results can also serve as a starting point for categorizing GO terms into region-specific and whole-protein terms and for selecting prediction methods for different classes of GO terms.

AVAILABILITY AND IMPLEMENTATION

The code and features are freely available at: https://github.com/ek1203/rsfp.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
