Laboratory of RNA Viruses, Institute for Frontier Life and Medical Sciences, Kyoto University, Kyoto 606-8507, Japan.
Department of Computer and Network Engineering, Graduate School of Informatics and Engineering, The University of Electro-Communications, Tokyo 182-8585, Japan.
Proc Natl Acad Sci U S A. 2021 Feb 2;118(5). doi: 10.1073/pnas.2010758118.
Understanding the genetics and taxonomy of ancient viruses will give us great insights into not only the origin and evolution of viruses but also how viral infections played roles in our evolution. Endogenous viruses are remnants of ancient viral infections and are thought to retain the genetic characteristics of viruses from ancient times. In this study, we used machine learning of endogenous RNA virus sequence signatures to identify viruses in the human genome that have not been detected or are already extinct. Here, we show that the k-mer occurrence of ancient RNA viral sequences remains similar to that of extant RNA viral sequences and can be differentiated from that of other human genome sequences. Furthermore, using this characteristic, we screened RNA viral insertions in the human reference genome and found virus-like insertions with phylogenetic and evolutionary features indicative of an exogenous origin but lacking homology to previously identified sequences. Our analysis indicates that animal genomes still contain unknown virus-derived sequences and provides a glimpse into the diversity of the ancient virosphere.
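To illustrate what a k-mer occurrence signature looks like in practice, the following is a minimal Python sketch that turns a nucleotide sequence into a fixed-length vector of normalized k-mer frequencies, the kind of feature vector a classifier could be trained on. The function name `kmer_frequencies`, the default `k=3`, and the choice to skip windows containing ambiguous bases are illustrative assumptions; the abstract does not specify the value of k, the normalization, or the learning model the authors used.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3):
    """Return normalized k-mer frequencies in a fixed lexicographic order.

    Hypothetical helper for illustration only; not the authors' pipeline.
    Windows containing characters other than A, C, G, T (e.g. N) are skipped.
    """
    alphabet = "ACGT"
    valid = set(alphabet)
    seq = seq.upper()

    # Count every overlapping window of length k made only of A/C/G/T.
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    )
    total = sum(counts.values())

    # Enumerate all 4**k possible k-mers so every sequence maps to a
    # vector of the same length, regardless of which k-mers it contains.
    all_kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    return [counts[m] / total if total else 0.0 for m in all_kmers]


if __name__ == "__main__":
    vector = kmer_frequencies("ACGTACGTNNACGT", k=3)
    print(len(vector))
```

Vectors produced this way for known viral sequences and for background genome sequences could then be passed to any standard classifier; which model best separates the two classes is left open here.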