Protein domain organisation: adding order.

Author information

Kummerfeld Sarah K, Teichmann Sarah A

Affiliation

Department of Developmental Biology, 279 Campus Dr, Stanford, 94305, CA, USA.

Publication information

BMC Bioinformatics. 2009 Jan 29;10:39. doi: 10.1186/1471-2105-10-39.

DOI:10.1186/1471-2105-10-39

PMID:19178743

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC2657131/

Abstract

BACKGROUND

Domains are the building blocks of proteins. During evolution they have been duplicated, fused and recombined to produce proteins with novel structures and functions. Structural and genome-scale studies have shown that pairs or groups of domains observed together in a protein are almost always found in only one N- to C-terminal order, and that each such combination is the result of a single recombination event propagated by duplication of the multi-domain unit. Previous studies of domain organisation have used graph theory to represent the co-occurrence of domains within proteins. We build on this approach by adding directionality to the graphs and connecting nodes according to their relative order in the protein. Most of the time, the linear order of domains is conserved. However, using the directed graph representation, we have identified non-linear features of domain organisation that are over-represented in genomes. Recognising these patterns and unravelling how they have arisen may help us to understand the functional relationships between domains and how the protein repertoire has evolved.

RESULTS

We identify groups of domains that are not linearly conserved but have instead been shuffled during evolution, so that they occur in multiple different orders. We consider 192 genomes across all three kingdoms of life and use domain and protein annotation to understand the functional significance of these groups. To identify these features and assess their statistical significance, we represent the linear order of domains in proteins as a directed graph and apply graph-theoretical methods. We describe two higher-order patterns of domain organisation, clusters and bi-directionally associated domain pairs, and explore their functional importance and phylogenetic conservation.
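The directed graph representation described above can be illustrated with a minimal sketch. It assumes that an edge is drawn from each domain to the domain immediately C-terminal to it, which is one plausible reading of "relative order"; the abstract does not spell out the exact rule. The protein names, domain labels and function names are hypothetical and chosen only for illustration.

```python
def build_domain_graph(architectures):
    """Map each domain to the set of domains seen directly C-terminal to it.

    `architectures` maps a protein identifier to its domains listed in
    N- to C-terminal order. Adjacency-based edges are an assumption here.
    """
    graph = {}
    for domains in architectures.values():
        for domain in domains:
            graph.setdefault(domain, set())
        for upstream, downstream in zip(domains, domains[1:]):
            graph[upstream].add(downstream)
    return graph


def bidirectional_pairs(graph):
    """Return unordered domain pairs linked in both orders (a->b and b->a)."""
    pairs = set()
    for a, successors in graph.items():
        for b in successors:
            if a != b and a in graph.get(b, set()):
                pairs.add(tuple(sorted((a, b))))
    return pairs


# Toy input: three hypothetical proteins with abstract domain labels.
proteins = {
    "protein_1": ["A", "B", "C"],
    "protein_2": ["B", "A"],
    "protein_3": ["C", "D"],
}
graph = build_domain_graph(proteins)
print(bidirectional_pairs(graph))
```

In this toy input, domains A and B occur in both orders in different proteins, so they form the only bi-directionally associated pair; B-C and C-D are seen in one order only.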

CONCLUSION

Taking into account the order of domains, we have derived a novel picture of global protein organisation. We found that, relative to random graphs with identical degree distributions, all genomes have a higher than expected degree of clustering and more domain pairs that occur in both forward and reverse orientation in different proteins. Although these features are statistically over-represented, they are still fairly rare. Looking in detail at the proteins involved, we found strong functional relationships within each cluster. In addition, the domains tended to be involved in protein-protein interactions and to be able to function as independent structural units. A particularly striking example is the human Jak-STAT signalling pathway, which makes use of a set of domains in a range of orders and orientations to provide nuanced signalling functionality. This illustrates the importance of functional and structural constraints (or the lack thereof) on domain organisation.
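The comparison against random graphs with identical degree distributions can be sketched as follows. This is not the authors' procedure, which the abstract does not detail; it is a generic edge-swap randomisation that preserves every node's in-degree and out-degree, followed by an empirical comparison of the number of reciprocal (forward and reverse) domain pairs. The function names, the number of swaps and the toy edge set are all assumptions, and clustering is not covered.

```python
import random


def count_reciprocal(edges):
    """Count unordered pairs {a, b} joined by both a->b and b->a."""
    return sum(1 for a, b in edges if a < b and (b, a) in edges)


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomise a directed edge set while keeping all in/out-degrees fixed."""
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        # Proposed swap: a->d and c->b. Reject self-loops and duplicates.
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_set


def reciprocal_enrichment(edges, n_random=200, seed=0):
    """Return (observed count, mean null count, empirical p-value)."""
    rng = random.Random(seed)
    observed = count_reciprocal(edges)
    null_counts = [
        count_reciprocal(degree_preserving_shuffle(edges, 10 * len(edges), rng))
        for _ in range(n_random)
    ]
    at_least = sum(1 for count in null_counts if count >= observed)
    return observed, sum(null_counts) / n_random, (at_least + 1) / (n_random + 1)


# Toy directed domain graph with one reciprocal pair (A <-> B).
toy_edges = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A")}
print(reciprocal_enrichment(toy_edges, n_random=50))
```

A graph this small cannot show meaningful enrichment; the sketch only shows the shape of the test. Swapping edge endpoints rather than rewiring freely is what keeps the degree distribution of each random graph identical to the observed one.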

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b7d/2657131/f04e5bd250a4/1471-2105-10-39-1.jpg
