Multi-layer sequential network analysis improves protein 3D structural classification.

Author information

Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, Indiana, USA.

Center for Data and Computing in Natural Sciences (CDCS), Institute for Computational Systems Biology, Universität Hamburg, Hamburg, Germany.

Publication information

Proteins. 2022 Sep;90(9):1721-1731. doi: 10.1002/prot.26349. Epub 2022 May 2.

Abstract

Protein structural classification (PSC) is a supervised problem of assigning proteins into pre-defined structural (e.g., CATH or SCOPe) classes based on the proteins' sequence or 3D structural features. We recently proposed PSC approaches that model protein 3D structures as protein structure networks (PSNs) and analyze PSN-based protein features, which performed better than or comparable to state-of-the-art sequence or other 3D structure-based PSC approaches. However, existing PSN-based PSC approaches model the whole 3D structure of a protein as a static (i.e., single-layer) PSN. Because folding of a protein is a dynamic process, where some parts (i.e., sub-structures) of a protein fold before others, modeling the 3D structure of a protein as a PSN that captures the sub-structures might further help improve the existing PSC performance. Here, we propose to model 3D structures of proteins as multi-layer sequential PSNs that approximate 3D sub-structures of proteins, with the hypothesis that this will improve upon the current state-of-the-art PSC approaches that are based on single-layer PSNs (and thus upon the existing state-of-the-art sequence and other 3D structural approaches). Indeed, we confirm this on 72 datasets spanning ~44 000 CATH and SCOPe protein domains.
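
To make the contrast between a single-layer PSN and a multi-layer sequential PSN concrete, here is a minimal Python sketch. It rests on assumptions the abstract does not state: residues are represented by one 3D point each, two residues are linked when their points lie within a distance cutoff, and each layer covers a growing prefix of the sequence. The function names, the 6.0 cutoff, and the `step` parameter are hypothetical choices for illustration, not the authors' implementation.

```python
import math

def contact_edges(coords, cutoff=6.0):
    """Single-layer PSN: pairs (i, j), i < j, of residues within `cutoff` of each other."""
    edges = set()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                edges.add((i, j))
    return edges

def sequential_layers(coords, step=2, cutoff=6.0):
    """Multi-layer sequential PSN (assumed scheme): one layer per growing sequence prefix.

    Layer k keeps only the contacts among the first k * step residues,
    and the last layer always covers the whole chain.
    """
    all_edges = contact_edges(coords, cutoff)
    n = len(coords)
    sizes = list(range(step, n, step)) + [n]
    # An edge (i, j) with i < j belongs to a prefix of length `size` when j < size.
    return [{(i, j) for (i, j) in all_edges if j < size} for size in sizes]

# Toy chain of five residues (coordinates are made up for illustration).
coords = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (12, 0, 0), (4, 3, 0)]
layers = sequential_layers(coords, step=2, cutoff=6.0)
for k, layer in enumerate(layers, start=1):
    print(f"layer {k}: {sorted(layer)}")
```

Under this scheme the final layer is the ordinary single-layer PSN, while the earlier layers approximate sub-structures formed by the part of the chain that comes first in the sequence. Features would then be extracted from the whole sequence of layers rather than from the final network alone.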

Similar articles

2
Network-based protein structural classification.
R Soc Open Sci. 2020 Jun 3;7(6):191461. doi: 10.1098/rsos.191461. eCollection 2020 Jun.
5
SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition.
BMC Bioinformatics. 2007 May 22;8 Suppl 4(Suppl 4):S2. doi: 10.1186/1471-2105-8-S4-S2.
6
Structural Class Classification of 3D Protein Structure Based on Multi-View 2D Images.
IEEE/ACM Trans Comput Biol Bioinform. 2018 Jan-Feb;15(1):286-299. doi: 10.1109/TCBB.2016.2603987. Epub 2016 Aug 29.
7
New functional families (FunFams) in CATH to improve the mapping of conserved functional sites to 3D structures.
Nucleic Acids Res. 2013 Jan;41(Database issue):D490-8. doi: 10.1093/nar/gks1211. Epub 2012 Nov 29.
9
Toward high-throughput, multicriteria protein-structure comparison and analysis.
IEEE Trans Nanobioscience. 2010 Jun;9(2):144-55. doi: 10.1109/TNB.2010.2043851.

Cited by

2
Transcription factor prediction using protein 3D secondary structures.
Bioinformatics. 2024 Dec 26;41(1). doi: 10.1093/bioinformatics/btae762.
3
Prognostic importance of splicing-triggered aberrations of protein complex interfaces in cancer.
NAR Genom Bioinform. 2024 Sep 26;6(3):lqae133. doi: 10.1093/nargab/lqae133. eCollection 2024 Sep.
4
Current and future directions in network biology.
Bioinform Adv. 2024 Aug 14;4(1):vbae099. doi: 10.1093/bioadv/vbae099. eCollection 2024.

