Genomic and evolutionary insights into genes encoding proteins with single amino acid repeats.

Authors

Siwach Pratibha, Pophaly Saurabh Dilip, Ganesh Subramaniam

Affiliation

Department of Biological Sciences and Bioengineering, Indian Institute of Technology, Kanpur, India.

Publication

Mol Biol Evol. 2006 Jul;23(7):1357-69. doi: 10.1093/molbev/msk022. Epub 2006 Apr 17.

DOI: 10.1093/molbev/msk022
PMID: 16618963
Abstract

Mutations causing expansion of amino acid repeats are responsible for 19 hereditary disorders. Repeats in several other proteins also show length variations. These observations prompted us to identify single amino acid repeat-containing proteins (SARPs) in humans and to understand their functional and evolutionary significance. We identified 8812 SARPs containing 17,146 repeat domains, each harboring 4 or more residues. In all, 5% of SARPs (471) showed repeat length variations, and nearly 84% of them (394) have repeats of 10 residues or fewer. We find that SARPs are involved in functions that require formation of multiprotein complexes. Nearly 78% (6859) of the SARPs had no paralogue in the human proteome, and such proteins are considered orphan SARPs. Orphan SARPs show longer repeat stretches, longer peptide length, and lower expression levels than SARPs belonging to a protein family. Because the intensity of gene expression is known to relate inversely to the rate of protein sequence evolution, our results suggest that the orphan SARPs evolve faster than the familial forms and therefore are under weaker selection pressure. We also find that while GC-rich codons are favored for coding the repeat tracts of SARPs, specific codons and not nucleotide motifs per se are selected, suggesting functional constraints on codon usage. One of the constraints could be mRNA stability, as clustering of rare codons is known to destabilize transcripts and rare codons are not favored for coding repeat tracts. Genes encoding polymorphic SARPs show preferential localization toward the telomeric segments. Further, the sex-specific recombination rate of the chromosomal locus strongly correlates with the parental gender that influences repeat instability in disorders caused by dynamic mutations. Therefore, instability associated with repeats might be driven by processes that are specific to sperm or oocyte development, and the recombination frequency might play a positive role in this process.
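The abstract defines a repeat domain as a run of 4 or more identical residues. As a minimal sketch of that criterion only (the authors' actual pipeline, data sources, and filtering are not described here, and the names `Repeat` and `find_single_aa_repeats` are hypothetical), such runs can be located in a protein sequence with a regular-expression backreference:

```python
import re
from typing import List, NamedTuple


class Repeat(NamedTuple):
    """One single amino acid repeat tract (hypothetical helper type)."""
    residue: str  # one-letter amino acid code
    start: int    # 0-based index of the first residue in the run
    length: int   # number of consecutive identical residues


def find_single_aa_repeats(seq: str, min_len: int = 4) -> List[Repeat]:
    """Return every run of at least `min_len` identical residues in `seq`.

    The default threshold of 4 follows the abstract; everything else
    here is an illustrative assumption, not the paper's method.
    """
    if min_len < 2:
        raise ValueError("min_len must be at least 2")
    # ([A-Z]) captures one residue; \1{n,} requires n or more further copies.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [
        Repeat(m.group(1), m.start(), m.end() - m.start())
        for m in pattern.finditer(seq.upper())
    ]


if __name__ == "__main__":
    # Toy sequence, not a real protein: a Q6 tract, an A4 tract, and an S3 run.
    for rep in find_single_aa_repeats("MKQQQQQQAAAALGSSSP"):
        print(rep)
```

On the toy sequence, the six-residue glutamine run and the four-residue alanine run are reported, while the three-residue serine run falls below the threshold. A protein would count as a SARP under this sketch if the function returns at least one tract.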


Similar articles

1. Genomic and evolutionary insights into genes encoding proteins with single amino acid repeats.
   Mol Biol Evol. 2006 Jul;23(7):1357-69. doi: 10.1093/molbev/msk022. Epub 2006 Apr 17.
2. Polymorphism, shared functions and convergent evolution of genes with sequences coding for polyalanine domains.
   Hum Mol Genet. 2003 Nov 15;12(22):2967-79. doi: 10.1093/hmg/ddg329. Epub 2003 Sep 30.
3. Simple sequence repeats in proteins and their significance for network evolution.
   Gene. 2005 Jan 17;345(1):113-8. doi: 10.1016/j.gene.2004.11.023. Epub 2004 Dec 15.
4. Highly constrained proteins contain an unexpectedly large number of amino acid tandem repeats.
   Genomics. 2007 Mar;89(3):316-25. doi: 10.1016/j.ygeno.2006.11.011. Epub 2006 Dec 28.
5. Length variation of CAG/CAA triplet repeats in 50 genes among 16 inbred mouse strains.
   Gene. 2005 Apr 11;349:107-19. doi: 10.1016/j.gene.2004.11.050.
6. Comparative analysis of amino acid repeats in rodents and humans.
   Genome Res. 2004 Apr;14(4):549-54. doi: 10.1101/gr.1925704.
7. Spatial positions of homopolymeric repeats in the human proteome and their effect on cellular toxicity.
   Biochem Biophys Res Commun. 2009 Mar 6;380(2):382-6. doi: 10.1016/j.bbrc.2009.01.101. Epub 2009 Jan 23.
8. Satellog: a database for the identification and prioritization of satellite repeats in disease association studies.
   BMC Bioinformatics. 2005 Jun 10;6:145. doi: 10.1186/1471-2105-6-145.
9. RCPdb: An evolutionary classification and codon usage database for repeat-containing proteins.
   Genome Res. 2007 Jul;17(7):1118-27. doi: 10.1101/gr.6255407. Epub 2007 Jun 13.
10. A census of protein repeats.
    J Mol Biol. 1999 Oct 15;293(1):151-60. doi: 10.1006/jmbi.1999.3136.

Cited by

1. Novel primers drive accurate SYBR Green PCR detection of Listeria monocytogenes and Listeria innocua in cultures and mushrooms.
   Sci Rep. 2025 Jan 8;15(1):1357. doi: 10.1038/s41598-024-81508-6.
2. An innovative approach to decoding genetic variability in Pseudomonas aeruginosa via amino acid repeats and gene structure profiles.
   Sci Rep. 2024 Sep 30;14(1):22610. doi: 10.1038/s41598-024-73031-5.
3. Terminal regions of a protein are a hotspot for low complexity regions and selection.
   Open Biol. 2024 Jun;14(6):230439. doi: 10.1098/rsob.230439. Epub 2024 Jun 12.
4. The Presence of Two Genes in a Subset of Acanthopterygii Fish Is Associated with a Polyserine Insert in MyoD1.
   J Dev Biol. 2023 Apr 28;11(2):19. doi: 10.3390/jdb11020019.
5. Low complexity regions in the proteins of prokaryotes perform important functional roles and are highly conserved.
   Nucleic Acids Res. 2019 Nov 4;47(19):9998-10009. doi: 10.1093/nar/gkz730.
6. Selection pressure on human STR loci and its relevance in repeat expansion disease.
   Mol Genet Genomics. 2016 Oct;291(5):1851-69. doi: 10.1007/s00438-016-1219-7. Epub 2016 Jun 11.
7. Tandem amino acid repeats in the green anole (Anolis carolinensis) and other squamates may have a role in increasing genetic variability.
   BMC Genomics. 2016 Feb 12;17:109. doi: 10.1186/s12864-016-2430-y.
8. Expansion of polyalanine tracts in the QA domain may play a critical role in the clavicular development of cleidocranial dysplasia.
   J Genet. 2015 Sep;94(3):551-3. doi: 10.1007/s12041-015-0551-8.
9. Association of polyalanine and polyglutamine coiled coils mediates expansion disease-related protein aggregation and dysfunction.
   Hum Mol Genet. 2014 Jul 1;23(13):3402-20. doi: 10.1093/hmg/ddu049. Epub 2014 Feb 4.
10. Adaptive genetic markers discriminate migratory runs of Chinook salmon (Oncorhynchus tshawytscha) amid continued gene flow.
    Evol Appl. 2013 Dec;6(8):1184-94. doi: 10.1111/eva.12095. Epub 2013 Sep 10.