Assessing the low complexity of protein sequences via the low complexity triangle.

Affiliation

Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Mainz, Germany.

Publication information

PLoS One. 2020 Dec 30;15(12):e0239154. doi: 10.1371/journal.pone.0239154. eCollection 2020.

DOI:10.1371/journal.pone.0239154
PMID:33378336
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7773278/
Abstract

BACKGROUND

Proteins with low complexity regions (LCRs) have atypical sequence and structural features. Their amino acid composition deviates from the proteome-wide expectation, and they do not follow the structural folding rules that prevail in globular regions. One way to characterize these regions is to assess the repeatability of a sequence, that is, to calculate the local propensity of a region to be part of a repeat.

RESULTS

We combine two local measures of low complexity, repeatability (using the RES algorithm) and the fraction of the most frequent amino acid, to evaluate different proteomes, datasets of protein regions with specific features, and individual cases of proteins with extreme compositions. As a proof of concept, we apply a representation called the 'low complexity triangle' to display the measured low complexity values. Results show that proteomes have distinct signatures in the low complexity triangle, and that these signatures are associated with complexity features of the sequences. We developed a web tool called LCT (http://cbdm-01.zdv.uni-mainz.de/~munoz/lct/) to allow users to calculate the low complexity triangle of a given protein or region of interest.
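
Of the two measures named above, the fraction of the most frequent amino acid is simple enough to illustrate directly. The following is a minimal sketch, not the authors' implementation: the function names, the sliding-window scheme and the default window length of 20 are assumptions made for illustration, and the RES repeatability score is not reproduced here.

```python
from collections import Counter


def top_residue_fraction(region: str) -> float:
    """Fraction of a region occupied by its most frequent amino acid."""
    region = region.upper()
    if not region:
        raise ValueError("region must be non-empty")
    # most_common(1) returns [(residue, count)] for the top residue
    top_count = Counter(region).most_common(1)[0][1]
    return top_count / len(region)


def windowed_top_fraction(sequence: str, window: int = 20) -> list[float]:
    """Top-residue fraction for every full-length window along a sequence.

    The window length is a hypothetical choice, not a value from the paper.
    Sequences shorter than the window yield an empty list.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        top_residue_fraction(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


if __name__ == "__main__":
    print(top_residue_fraction("QQQQQQQQQQ"))   # homorepeat
    print(top_residue_fraction("SRSRSRSRSR"))   # perfect direpeat
    print(windowed_top_fraction("AAAAGG", window=4))
```

Under this definition a perfect homorepeat scores 1.0 and a perfect two-residue repeat scores 0.5, which is consistent with the abstract's statement that homorepeats and direpeats occupy characteristic positions in the triangle.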

CONCLUSIONS

The low complexity triangle proves to be a suitable procedure to represent the general low complexity of a sequence or protein dataset. Homorepeats, direpeats, compositionally biased regions and globular regions occupy characteristic positions in the triangle. The described pipeline can be used to characterize LCRs and may help in quantifying the content of degenerated tandem repeats in proteins and proteomes.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e836/7773278/ba0b4dd09bd4/pone.0239154.g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e836/7773278/b32629548577/pone.0239154.g002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e836/7773278/0a6bdc0bbfce/pone.0239154.g003.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e836/7773278/1ec6fb505f3a/pone.0239154.g004.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e836/7773278/c71d00acabc6/pone.0239154.g005.jpg

Similar articles

1
Assessing the low complexity of protein sequences via the low complexity triangle.
PLoS One. 2020 Dec 30;15(12):e0239154. doi: 10.1371/journal.pone.0239154. eCollection 2020.
2
Repeatability in protein sequences.
J Struct Biol. 2019 Nov 1;208(2):86-91. doi: 10.1016/j.jsb.2019.08.003. Epub 2019 Aug 10.
3
REP2: A Web Server to Detect Common Tandem Repeats in Protein Sequences.
J Mol Biol. 2021 May 28;433(11):166895. doi: 10.1016/j.jmb.2021.166895. Epub 2021 Feb 24.
4
dAPE: a web server to detect homorepeats and follow their evolution.
Bioinformatics. 2017 Apr 15;33(8):1221-1223. doi: 10.1093/bioinformatics/btw790.
5
HRaP: database of occurrence of HomoRepeats and patterns in proteomes.
Nucleic Acids Res. 2014 Jan;42(Database issue):D273-8. doi: 10.1093/nar/gkt927. Epub 2013 Oct 22.
6
Regions with two amino acids in protein sequences: A step forward from homorepeats into the low complexity landscape.
Comput Struct Biotechnol J. 2022 Sep 18;20:5516-5523. doi: 10.1016/j.csbj.2022.09.011. eCollection 2022.
7
RepSeq--a database of amino acid repeats present in lower eukaryotic pathogens.
BMC Bioinformatics. 2007 Apr 11;8:122. doi: 10.1186/1471-2105-8-122.
8
Low Complexity Regions in Proteins and DNA are Poorly Correlated.
Mol Biol Evol. 2023 Apr 4;40(4). doi: 10.1093/molbev/msad084.
9
Occurrence of disordered patterns and homorepeats in eukaryotic and bacterial proteomes.
Mol Biosyst. 2012 Jan;8(1):327-37. doi: 10.1039/c1mb05318c. Epub 2011 Oct 18.
10
Context characterization of amino acid homorepeats using evolution, position, and order.
Proteins. 2017 Apr;85(4):709-719. doi: 10.1002/prot.25250. Epub 2017 Feb 6.

Cited by

1
Intrinsically Disordered Compositional Bias in Proteins: Sequence Traits, Region Clustering, and Generation of Hypothetical Functional Associations.
Bioinform Biol Insights. 2024 Oct 15;18:11779322241287485. doi: 10.1177/11779322241287485. eCollection 2024.
2
Terminal regions of a protein are a hotspot for low complexity regions and selection.
Open Biol. 2024 Jun;14(6):230439. doi: 10.1098/rsob.230439. Epub 2024 Jun 12.
3
Optimizing strategy for the discovery of compositionally-biased or low-complexity regions in proteins.
Sci Rep. 2024 Jan 5;14(1):680. doi: 10.1038/s41598-023-50991-8.
4
fLPS 2.0: rapid annotation of compositionally-biased regions in biological sequences.
PeerJ. 2021 Oct 28;9:e12363. doi: 10.7717/peerj.12363. eCollection 2021.
5
The Conservation of Low Complexity Regions in Bacterial Proteins Depends on the Pathogenicity of the Strain and Subcellular Location of the Protein.
Genes (Basel). 2021 Mar 22;12(3):451. doi: 10.3390/genes12030451.
6
The Role of Low Complexity Regions in Protein Interaction Modes: An Illustration in Huntingtin.
Int J Mol Sci. 2021 Feb 9;22(4):1727. doi: 10.3390/ijms22041727.