Fast protein structure comparison through effective representation learning with contrastive graph neural networks.

Affiliation

Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China.

Publication information

PLoS Comput Biol. 2022 Mar 24;18(3):e1009986. doi: 10.1371/journal.pcbi.1009986. eCollection 2022 Mar.

Abstract

Protein structure alignment algorithms are often time-consuming, which makes large-scale retrieval based on protein structure similarity challenging. There is an urgent need for more efficient structure comparison approaches as the number of protein structures increases rapidly. In this paper, we propose an effective graph-based protein structure representation learning method, GraSR, for fast and accurate structure comparison. In GraSR, a graph is constructed based on the inter-residue distances derived from the tertiary structure. Then, deep graph neural networks (GNNs) with a shortcut connection learn graph representations of the tertiary structures under a contrastive learning framework. To further improve GraSR, a novel dynamic training data partition strategy and a length-scaling cosine distance are introduced. We objectively evaluate GraSR on SCOPe v2.07 and on a newly released independent test set from the PDB database, using a comprehensive performance metric that we designed. Compared with other state-of-the-art methods, GraSR achieves an improvement of about 7%-10% on the two benchmark datasets. GraSR is also much faster than alignment-based methods. An analysis of the model shows that the superiority of GraSR comes mainly from the learned discriminative residue-level and global descriptors. The web server and source code of GraSR are freely available at www.csbio.sjtu.edu.cn/bioinf/GraSR/ for academic use.
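The two ends of the pipeline described in the abstract, building a graph from residue distances and comparing fixed-length descriptors, can be illustrated with a minimal sketch. It is not the GraSR implementation: the function names, the 10 Å cutoff, and the use of one representative atom per residue are assumptions made for illustration. The learned GNN encoder is omitted, and plain cosine distance stands in for the length-scaling variant, whose exact form is not given in the abstract.

```python
import math


def distance_graph(coords, cutoff=10.0):
    """Link residues whose representative atoms lie within `cutoff` angstroms.

    `coords` holds one (x, y, z) tuple per residue. The 10 A cutoff is an
    illustrative assumption, not a value taken from the paper.
    Returns an adjacency list: {residue index: [neighbour indices]}.
    """
    n = len(coords)
    edges = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                edges[i].append(j)
                edges[j].append(i)
    return edges


def cosine_distance(u, v):
    """Plain cosine distance (1 - cosine similarity) between two descriptors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def rank_by_distance(query, database):
    """Return database keys ordered from most to least similar to `query`.

    `database` maps a structure name to its descriptor vector; in a real
    system these vectors would come from a trained encoder.
    """
    return sorted(database, key=lambda name: cosine_distance(query, database[name]))
```

Because each structure is reduced to one vector, retrieval needs only one distance computation per database entry and no pairwise structure alignment, which is consistent with the speed advantage over alignment-based methods that the abstract reports.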

