Identifying and quantifying orphan protein sequences in fungi.

Author information

Stockholm Bioinformatics Center/Center for Biomembrane Research, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden.

Publication information

J Mol Biol. 2010 Feb 19;396(2):396-405. doi: 10.1016/j.jmb.2009.11.053. Epub 2009 Nov 26.

Abstract

For large regions of many proteins, and even entire proteins, no homology to known domains or proteins can be detected. These sequences are often referred to as orphans. Surprisingly, it has been reported that the large number of orphans is sustained in spite of a rapid increase of available genomic sequences. However, it is believed that de novo creation of coding sequences is rare in comparison to mechanisms such as domain shuffling and gene duplication; hence, most sequences should have homologs in other genomes. To investigate this, the sequences of 19 complete fungal genomes were compared. By using the phylogenetic relationship between these genomes, we could identify potentially de novo created orphans in Saccharomyces cerevisiae. We found that only a small fraction, <2%, of the S. cerevisiae proteome is orphan, which confirms that de novo creation of coding sequences is indeed rare. Furthermore, we found it necessary to compare the most closely related species to distinguish between de novo created sequences and rapidly evolving sequences where homologs are present but cannot be detected. Next, the orphan proteins (OPs) and orphan domains (ODs) were characterized. First, it was observed that both OPs and ODs are short. In addition, at least some of the OPs have been shown to be functional in experimental assays, showing that they are not pseudogenes. Furthermore, in contrast to what has been reported before and what is seen for older orphans, S. cerevisiae-specific ODs and proteins are not more disordered than other proteins. This might indicate that many of the older, and earlier classified, orphans are indeed fast-evolving sequences. Finally, >90% of the detected ODs are located at the protein termini, which suggests that these orphans could have been created by mutations that have affected the start or stop codons.
