Discovery and Analysis of Repeat and Low-Complexity Architectures in Proteins and Their Conserved Evolutionary Relationships Using Self-Homology Dot Plots.

Affiliations

Structural Biology Group, Biological and Chemical Research Centre, Faculty of Chemistry, University of Warsaw, Warsaw, Poland.

i3S - Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal.

Publication information

Methods Mol Biol. 2025;2870:95-116. doi: 10.1007/978-1-0716-4213-9_7.

Abstract

Proteins that contain sequence repetitions and low complexity regions can be analyzed using self-homology dot plot analysis. Dot plots can readily identify protein sequence repeats; the number of repeats and their length and location within the protein sequence are readily identifiable from the dot plots without the need to pre-define any of these attributes, making this method largely model-independent. We discuss the criteria for statistical identification of protein repeats and recommend simple ways of identifying protein repeats. While higher levels of sequence conservation within the repeats do make them easier to formally identify, this method can identify protein repeats with fairly low levels of conservation, as well as notably non-tandem repetitions with sizeable sections of complex, non-repeat sequence separating the individual repeat instances. Furthermore, even simple visual examination of these dot plots can discover conserved patterns within families of closely related proteins, and the level of this conservation can be readily quantified using a Jaccard index. Exhaustive pairwise comparisons can be assembled using hierarchical clustering methods to get a picture of the conserved repeat architectures within families of repeat proteins.
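The workflow described in the abstract (build a self-homology dot plot, then compare plots with a Jaccard index) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The window size, the identity threshold, the helper names (`self_dot_plot`, `jaccard`, `pairwise_distances`), and the toy sequences are all assumptions made for illustration, and the sketch assumes the compared sequences are already aligned so that dot-plot coordinates are comparable between proteins.

```python
from itertools import combinations


def self_dot_plot(seq, window=5, min_matches=4):
    """Return the set of (i, j) cells with i < j where the windows starting
    at i and j share at least `min_matches` identical residues.

    Plain identity scoring is an assumption of this sketch; a substitution
    matrix could be used instead.
    """
    n = len(seq) - window + 1
    dots = set()
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only; the plot is symmetric
            matches = sum(a == b for a, b in zip(seq[i:i + window],
                                                 seq[j:j + window]))
            if matches >= min_matches:
                dots.add((i, j))
    return dots


def jaccard(dots_a, dots_b):
    """Jaccard index |A & B| / |A | B| of two dot sets.

    Returning 0.0 when both plots are empty is a convention chosen here.
    """
    union = dots_a | dots_b
    return len(dots_a & dots_b) / len(union) if union else 0.0


def pairwise_distances(plots):
    """Return {(name_a, name_b): 1 - Jaccard} for every pair of dot plots."""
    return {
        (a, b): 1.0 - jaccard(plots[a], plots[b])
        for a, b in combinations(sorted(plots), 2)
    }


if __name__ == "__main__":
    unit = "GSPEDKLAW"  # made-up repeat unit, not taken from the chapter
    family = {
        "protA": "MK" + unit * 3,
        "protB": "MK" + unit + "GSPEDRLAW" + unit,  # one substitution
    }
    plots = {name: self_dot_plot(s) for name, s in family.items()}
    for pair, dist in pairwise_distances(plots).items():
        print(pair, round(dist, 3))
```

Repeats show up as runs of dots along diagonals whose offset equals the repeat spacing, so neither the repeat length nor the repeat count has to be specified in advance. The `1 - Jaccard` values can serve as pairwise distances for a hierarchical clustering routine, for example `scipy.cluster.hierarchy.linkage`, which accepts a condensed distance vector.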
