Characterizing the existing and potential structural space of proteins by large-scale multiple loop permutations.

Author information

School of Informatics, Indiana University Purdue University Indianapolis, and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 719 Indiana Avenue, Walker Plaza Building Suite 319, Indianapolis, IN 46202, USA.

Publication information

J Mol Biol. 2011 May 6;408(3):585-95. doi: 10.1016/j.jmb.2011.02.056. Epub 2011 Mar 2.

DOI:10.1016/j.jmb.2011.02.056

PMID:21376059

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3075335/

Abstract

Worldwide structural genomics projects are increasing the structure coverage of sequence space but have not significantly expanded the protein structure space itself (i.e., the number of unique structural folds) since 2007. Discovering new structural folds experimentally by directed evolution and random recombination of secondary-structure blocks has also proved rarely successful. Meanwhile, previous computational efforts at large-scale mapping of protein structure space have been limited to simple model proteins and have led to an inconclusive answer on the completeness of the existing observed protein structure space. Here, we build novel protein structures by extending naturally occurring circular (single-loop) permutation to multiple loop permutations (MLPs). These structures are clustered by a structural similarity measure called TM-score. The computational technique allows us to produce different structural clusters on the same naturally occurring, packed, stable core but with alternatively connected secondary-structure segments. A large-scale MLP of 2936 domains from the structural classification of protein domains reproduces the existing structural clusters (63%), mostly as hubs for many nonredundant sequences, and reveals novel clusters as islands adopted by only a few sequences. Results further show that there exist a significant number of novel, potentially stable clusters for medium-size or large-size single-domain proteins (in particular, >100 amino acid residues) that are either not yet adopted by nature or adopted by only a few sequences. This study suggests that MLP provides a simple yet highly effective tool for the engineering and design of novel protein structures (including naturally knotted proteins). The implication for template-based structure prediction of recovering new-fold targets from the Critical Assessment of Structure Prediction techniques (CASP) by MLP is also discussed.
Our MLP structures are available for download at the publication page of the Web site http://sparks.informatics.iupui.edu.
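
As a rough illustration of the idea in the abstract (not the authors' implementation), the Python sketch below enumerates alternative connection orders for the secondary-structure segments of a fixed core and labels each as native, circular (single-loop) or multiple-loop. It also includes the standard TM-score sum for an already-superimposed set of aligned residues. The segment names and the helper functions `is_rotation`, `enumerate_connectivities` and `tm_score_fixed` are hypothetical, introduced only for this example. Loop rebuilding, steric filtering and the superposition search of the real TM-score program are all omitted.

```python
from itertools import permutations


def is_rotation(order, native):
    """Return True if `order` is a cyclic rotation of `native`."""
    n = len(native)
    return any(native[i:] + native[:i] == order for i in range(n))


def enumerate_connectivities(segments):
    """Yield (order, kind) for every ordering of the given segments.

    kind is "native", "circular" (a rotation of the native order) or
    "multiple-loop" (any other reconnection). This is an assumption-laden
    toy: it treats every ordering as geometrically feasible.
    """
    native = tuple(segments)
    for order in permutations(native):
        if order == native:
            kind = "native"
        elif is_rotation(order, native):
            kind = "circular"
        else:
            kind = "multiple-loop"
        yield order, kind


def tm_score_fixed(distances, target_length):
    """TM-score sum for one fixed superposition.

    `distances` holds the distance (in angstroms) between each aligned
    residue pair; `target_length` is the length of the reference protein.
    The published TM-score maximizes this quantity over superpositions,
    which this sketch does not attempt.
    """
    if target_length <= 15:
        raise ValueError("d0 formula assumes target_length > 15")
    # Length-dependent distance scale; floored at 0.5, as in common practice.
    d0 = max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


if __name__ == "__main__":
    # Hypothetical four-segment core: two strands and two helices.
    counts = {}
    for order, kind in enumerate_connectivities(["E1", "H1", "E2", "H2"]):
        counts[kind] = counts.get(kind, 0) + 1
    print(counts)
    print(tm_score_fixed([0.0] * 100, 100))
```

For a core of n segments there are n! orderings, of which only n - 1 are non-native circular permutations, so even this toy count shows how quickly multiple-loop reconnections outnumber single-loop ones.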

Similar articles

Characterizing the existing and potential structural space of proteins by large-scale multiple loop permutations.

J Mol Biol. 2011 May 6;408(3):585-95. doi: 10.1016/j.jmb.2011.02.056. Epub 2011 Mar 2.

High-resolution structure prediction of a circular permutation loop.

Protein Sci. 2011 Nov;20(11):1929-34. doi: 10.1002/pro.725. Epub 2011 Sep 30.

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Improving protein fold recognition and template-based modeling by employing probabilistic-based matching between predicted one-dimensional structural properties of query and corresponding native properties of templates.

Bioinformatics. 2011 Aug 1;27(15):2076-82. doi: 10.1093/bioinformatics/btr350. Epub 2011 Jun 11.

Assembling novel protein folds from super-secondary structural fragments.

Proteins. 2003;53 Suppl 6:480-5. doi: 10.1002/prot.10542.

CLAP: a web-server for automatic classification of proteins with special reference to multi-domain proteins.

BMC Bioinformatics. 2014 Oct 4;15(1):343. doi: 10.1186/1471-2105-15-343.

Direct prediction of profiles of sequences compatible with a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles.

Proteins. 2014 Oct;82(10):2565-73. doi: 10.1002/prot.24620. Epub 2014 Jun 19.

A galaxy of folds.

Protein Sci. 2010 Jan;19(1):124-30. doi: 10.1002/pro.297.

Systematic analysis of short internal indels and their impact on protein folding.

BMC Struct Biol. 2010 Aug 4;10:24. doi: 10.1186/1472-6807-10-24.

CASP 11 target classification.

Proteins. 2016 Sep;84 Suppl 1(Suppl 1):20-33. doi: 10.1002/prot.24982. Epub 2016 Jan 27.

Cited by

Folds from fold: Exploring topological isoforms of a single-domain protein.

Proc Natl Acad Sci U S A. 2024 Oct 22;121(43):e2407355121. doi: 10.1073/pnas.2407355121. Epub 2024 Oct 15.

Deep generative models of protein structure uncover distant relationships across a continuous fold space.

Nat Commun. 2024 Sep 16;15(1):8094. doi: 10.1038/s41467-024-52020-2.

A review of visualisations of protein fold networks and their relationship with sequence and function.

Biol Rev Camb Philos Soc. 2023 Feb;98(1):243-262. doi: 10.1111/brv.12905. Epub 2022 Oct 9.

Trapping a Knot into Tight Conformations by Intra-Chain Repulsions.

Polymers (Basel). 2017 Feb 10;9(2):57. doi: 10.3390/polym9020057.

Rules for connectivity of secondary structure elements in protein: Two-layer αβ sandwiches.

Protein Sci. 2017 Nov;26(11):2257-2267. doi: 10.1002/pro.3285. Epub 2017 Sep 19.

Sixty-five years of the long march in protein secondary structure prediction: the final stretch?

Brief Bioinform. 2018 May 1;19(3):482-494. doi: 10.1093/bib/bbw129.

Protein rethreading: A novel approach to protein design.

Sci Rep. 2016 May 27;6:26847. doi: 10.1038/srep26847.

How a spatial arrangement of secondary structure elements is dispersed in the universe of protein folds.

PLoS One. 2014 Sep 22;9(9):e107959. doi: 10.1371/journal.pone.0107959. eCollection 2014.

Biophysics of protein evolution and evolutionary protein biophysics.

J R Soc Interface. 2014 Nov 6;11(100):20140419. doi: 10.1098/rsif.2014.0419.

Direct prediction of profiles of sequences compatible with a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles.

Proteins. 2014 Oct;82(10):2565-73. doi: 10.1002/prot.24620. Epub 2014 Jun 19.

References

Improving computational protein design by using structure-derived sequence profile.

Proteins. 2010 Aug 1;78(10):2338-48. doi: 10.1002/prot.22746.

How significant is a protein structure similarity with TM-score = 0.5?

Bioinformatics. 2010 Apr 1;26(7):889-95. doi: 10.1093/bioinformatics/btq066. Epub 2010 Feb 17.

In vitro selection of GTP-binding proteins by block shuffling of estrogen-receptor fragments.

Biochem Biophys Res Commun. 2009 Dec 18;390(3):689-93. doi: 10.1016/j.bbrc.2009.10.029. Epub 2009 Oct 13.

The continuity of protein structure space is an intrinsic property of proteins.

Proc Natl Acad Sci U S A. 2009 Sep 15;106(37):15690-5. doi: 10.1073/pnas.0907683106. Epub 2009 Sep 1.

Critical assessment of methods of protein structure prediction - Round VIII.

Proteins. 2009;77 Suppl 9:1-4. doi: 10.1002/prot.22589.

Probing the "dark matter" of protein fold space.

Structure. 2009 Sep 9;17(9):1244-52. doi: 10.1016/j.str.2009.07.012.

Nature of the protein universe.

Proc Natl Acad Sci U S A. 2009 Jul 7;106(27):11079-84. doi: 10.1073/pnas.0905029106. Epub 2009 Jun 18.

Discrete-continuous duality of protein structure space.

Curr Opin Struct Biol. 2009 Jun;19(3):321-8. doi: 10.1016/j.sbi.2009.04.009. Epub 2009 May 29.

Pokefind: a novel topological filter for use with protein structure prediction.

Bioinformatics. 2009 Jun 15;25(12):i281-8. doi: 10.1093/bioinformatics/btp198.

Protein structure prediction: when is it useful?

Curr Opin Struct Biol. 2009 Apr;19(2):145-55. doi: 10.1016/j.sbi.2009.02.005. Epub 2009 Mar 25.
