Domain-PFP: Protein Function Prediction Using Function-Aware Domain Embedding Representations.

Author information

Ibtehaz Nabil, Kagaya Yuki, Kihara Daisuke

Affiliations

Department of Computer Science, Purdue University, West Lafayette, IN, United States.

Department of Biological Sciences, Purdue University, West Lafayette, IN, United States.

Publication information

bioRxiv. 2023 Aug 24:2023.08.23.554486. doi: 10.1101/2023.08.23.554486.

Abstract

Domains are functional and structural units of proteins that govern various biological functions performed by the proteins. Therefore, the characterization of domains in a protein can serve as a proper functional representation of proteins. Here, we employ a self-supervised protocol to derive functionally consistent representations for domains by learning domain-Gene Ontology (GO) co-occurrences and associations. The domain embeddings we constructed turned out to be effective in performing actual function prediction tasks. Extensive evaluations showed that protein representations using the domain embeddings are superior to those of large-scale protein language models in GO prediction tasks. Moreover, the new function prediction method built on the domain embeddings, named Domain-PFP, significantly outperformed the state-of-the-art function predictors. Additionally, Domain-PFP demonstrated competitive performance in the CAFA3 evaluation, achieving overall the best performance among the top teams that participated in the assessment.
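
To make the idea of domain-GO co-occurrence concrete, here is a minimal sketch in Python. It is not the Domain-PFP training procedure: the abstract describes a self-supervised protocol that learns embeddings, whereas this sketch swaps in plain co-occurrence counts, using the conditional frequency P(GO | domain) as a stand-in "embedding" and averaging domain vectors to represent a protein. All protein, domain, and GO identifiers below are hypothetical toy values, and the function names are illustrative assumptions.

```python
from collections import defaultdict

# Toy annotations with hypothetical IDs: protein -> (domains, GO terms).
PROTEINS = {
    "P1": ({"D_kinase", "D_sh2"}, {"GO:phosphorylation", "GO:signaling"}),
    "P2": ({"D_kinase"}, {"GO:phosphorylation"}),
    "P3": ({"D_sh2"}, {"GO:signaling"}),
    "P4": ({"D_zinc"}, {"GO:dna_binding"}),
}


def domain_go_profiles(proteins):
    """Return (go_terms, profiles), where profiles[d][i] = P(go_terms[i] | d).

    Count-based stand-in for a learned domain embedding (an assumption of
    this sketch, not the method described in the abstract).
    """
    go_terms = sorted({go for _, gos in proteins.values() for go in gos})
    domain_count = defaultdict(int)  # proteins containing each domain
    pair_count = defaultdict(int)    # proteins with both the domain and the GO term
    for domains, gos in proteins.values():
        for d in domains:
            domain_count[d] += 1
            for go in gos:
                pair_count[(d, go)] += 1
    profiles = {
        d: [pair_count[(d, go)] / n for go in go_terms]
        for d, n in domain_count.items()
    }
    return go_terms, profiles


def protein_vector(domains, profiles, dim):
    """Represent a protein as the mean of its known domains' vectors."""
    known = [profiles[d] for d in domains if d in profiles]
    if not known:
        return [0.0] * dim  # no recognized domain: fall back to a zero vector
    return [sum(column) / len(known) for column in zip(*known)]


go_terms, profiles = domain_go_profiles(PROTEINS)
query = protein_vector({"D_kinase", "D_sh2"}, profiles, len(go_terms))
for go, score in zip(go_terms, query):
    print(go, score)
```

In this toy setting, each entry of `query` can be read as a crude score for the corresponding GO term. A learned embedding would replace the count-based profile, but the flow is the same: domain-level vectors are combined into a protein-level representation.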

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c927/10473699/640790d42ad8/nihpp-2023.08.23.554486v1-f0001.jpg
