Polymorphism, shared functions and convergent evolution of genes with sequences coding for polyalanine domains.

Author information

Lavoie Hugo, Debeane Francois, Trinh Quoc-Dien, Turcotte Jean-Francois, Corbeil-Girard Louis-Philippe, Dicaire Marie-Josée, Saint-Denis Anik, Pagé Martin, Rouleau Guy A, Brais Bernard

Affiliation

Laboratoire de Neurogénétique, Centre de Recherche du Centre Hospitalier de l'Université de Montréal, Québec, Canada.

Publication information

Hum Mol Genet. 2003 Nov 15;12(22):2967-79. doi: 10.1093/hmg/ddg329. Epub 2003 Sep 30.

Abstract

Mutations causing expansions of polyalanine domains are responsible for nine hereditary diseases. Other GC-rich sequences coding for some polyalanine domains were found to be polymorphic in humans. These observations prompted us to identify all sequences in the human genome coding for polyalanine stretches longer than four alanines and to establish their degree of polymorphism. We identified 494 annotated human proteins containing 604 polyalanine domains. Thirty-two percent (31/98) of tested sequences coding for more than seven alanines were polymorphic. The length of the polyalanine-coding sequence and its GCG or GCC repeat content are the major predictors of polymorphism. GCG codons are over-represented in human polyalanine-coding sequences. Our data suggest that GCG and GCC codons play a key role in polyalanine-coding sequence appearance and polymorphism. The grouping by shared function of polyalanine-containing proteins in Homo sapiens, Drosophila melanogaster and Caenorhabditis elegans shows that the majority are involved in transcriptional regulation. Phylogenetic analyses of the HOX, GATA and EVX protein families demonstrate that polyalanine domains arose independently in different members of these families, suggesting that convergent molecular evolution may have played a role. Finally, polyalanine domains in vertebrates are conserved between mammals and are rarer and shorter in Gallus gallus and Danio rerio. Together, our results show that the polymorphic nature of sequences coding for polyalanine domains makes them prime candidates for mutations in hereditary diseases, and suggest that they have appeared in many different protein families through convergent evolution.
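
The screening step described in the abstract (finding polyalanine stretches longer than four alanines, then examining the codons that encode them) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy protein, and the toy coding sequence are hypothetical, and "longer than four" is assumed here to mean uninterrupted runs of five or more alanines.

```python
import re
from collections import Counter

# Assumption: "longer than four alanines" is read as runs of five or more.
MIN_RUN = 5


def find_polyalanine_runs(protein, min_run=MIN_RUN):
    """Return (start, length) for each uninterrupted alanine run of at least min_run residues."""
    pattern = re.compile("A{%d,}" % min_run)
    return [(m.start(), len(m.group())) for m in pattern.finditer(protein.upper())]


def alanine_codon_counts(cds, start, length):
    """Count the codons that encode a run, given an in-frame coding sequence.

    start and length are in residues, so residue i maps to cds[3*i : 3*i + 3].
    """
    codons = [cds[3 * i: 3 * i + 3].upper() for i in range(start, start + length)]
    return Counter(codons)


# Hypothetical toy data, not taken from the paper: one run of six alanines
# and one run of three (the shorter run falls below the threshold).
protein = "MSAAAAAAGKAAAL"
cds = "ATGTCG" + "GCGGCGGCCGCGGCAGCG" + "GGCAAG" + "GCTGCTGCT" + "CTG"

runs = find_polyalanine_runs(protein)
for start, length in runs:
    counts = alanine_codon_counts(cds, start, length)
    print(start, length, dict(counts))
```

Applied across an annotated proteome, a scan like this would yield the kind of per-domain length and GCG/GCC content figures that the abstract names as predictors of polymorphism; the polymorphism testing itself requires genotype data and is outside this sketch.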
