Sequencing refractory regions in bird genomes are hotspots for accelerated protein evolution.

Affiliations

Gene Expression Unit, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, O&N1, bus 901, 3000, Leuven, Belgium.

Tissue Engineering Laboratory, Department of Development and Regeneration, KU Leuven Campus Kulak, Kortrijk, Belgium.

Publication information

BMC Ecol Evol. 2021 Sep 18;21(1):176. doi: 10.1186/s12862-021-01905-7.

Abstract

BACKGROUND

Approximately 1000 protein-encoding genes common to vertebrates are still unannotated in avian genomes. Are these genes evolutionarily lost, or have they not yet been found for technical reasons? Using genome landscapes as a tool to visualize large-scale regional effects of genome evolution, we reexamined this question.

RESULTS

On the basis of gene annotation in non-avian vertebrate genomes, we established a list of 15,135 common vertebrate genes. Of these, 1026 were not found in any of the eight bird genomes examined. Visualizing regional genome effects with our sliding-window approach showed that the majority of these "missing" genes can be clustered into 14 regions of the human reference genome. In these clusters, an additional 1517 genes (often gene fragments) were underrepresented in bird genomes. The clusters of "missing" genes coincided with regions of very high GC content, particularly in avian genomes, making them "hidden" because of incomplete sequencing. Moreover, proteins encoded by genes in these sequencing-refractory regions showed signs of accelerated protein evolution. As a proof of principle, we experimentally characterized the mRNA and protein products of four "hidden" bird genes that are crucial for energy homeostasis in skeletal muscle: ALDOA, ENO3, PYGM and SLC2A4.
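The link between sliding windows and GC content can be illustrated with a minimal sketch. The abstract does not state the window size, step, or exact quantity the authors computed per window, so the code below is only an illustration of the general idea (GC fraction per window along a sequence), not the authors' pipeline. The function names, the toy sequence, and the 0.75 threshold are all assumptions chosen for the example.

```python
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G/C among unambiguous bases (A, C, G, T); 0.0 if there are none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def sliding_gc(seq: str, window: int, step: int) -> List[Tuple[int, float]]:
    """Return (window_start, gc_fraction) for every full window along seq."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def flag_high_gc(profile: List[Tuple[int, float]], threshold: float) -> List[int]:
    """Start positions of windows whose GC fraction reaches the threshold."""
    return [start for start, gc in profile if gc >= threshold]


# Toy sequence (hypothetical): AT-rich flanks around a GC-rich core.
toy = "ATATATAT" + "GCGCGCGC" + "ATAT"
profile = sliding_gc(toy, window=8, step=4)
high_gc_starts = flag_high_gc(profile, threshold=0.75)  # threshold is an assumption
print(profile)
print(high_gc_starts)
```

On real data, the windows would run over chromosome-scale sequence, and the flagged windows would be compared with the positions of genes absent from the bird assemblies; both of those steps are outside this sketch.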

CONCLUSIONS

At least part of the "missing" genes in bird genomes can be attributed to an artifact caused by the difficulty of sequencing regions with extreme GC% ("hidden" genes). Biologically, these "hidden" genes are of interest because they encode proteins that evolve more rapidly than the genome-wide average. Finally, we show that four of these "hidden" genes encode key proteins for energy metabolism in flight muscle.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/484a/8449477/02ea41f708f5/12862_2021_1905_Fig1_HTML.jpg
