Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, 2 Research Park, Mailstop 9627, Mississippi State, MS 39762, USA.
Delta Research and Extension Center, Mississippi State University, 82 Stoneville Road, P.O. Box 197, Stoneville, MS 38776, USA.
Viruses. 2023 Jul 28;15(8):1643. doi: 10.3390/v15081643.
Analyses of Illumina-based high-throughput sequencing data generated during characterization of the cotton leafroll dwarf virus population in Mississippi (2020-2022) consistently yielded contigs varying in size (most frequently from 4 to 7 kb) with identical nucleotide content and sharing similarities with reverse transcriptases (RTases) encoded by extant plant pararetroviruses (family Caulimoviridae). These initial data prompted an in-depth study involving molecular and bioinformatic approaches to characterize the nature and origins of these caulimovirid-like sequences. Here, we report on endogenous viral elements (EVEs) related to extant members of the family Caulimoviridae integrated into the genome of upland cotton (Gossypium hirsutum), for which we propose the provisional name "endogenous cotton pararetroviral elements" (eCPRVE). Our investigations pinpointed a ~15 kbp-long locus on the A04 chromosome consisting of tandem copies in head-to-head orientation located on the positive- and negative-sense DNA strands (eCPRVE+ and eCPRVE-). Sequences of the eCPRVE+ comprised nearly complete, slightly decayed genome information, including ORFs coding for the viral movement protein (MP), coat protein (CP), RTase, and transactivator/viroplasm protein (TA). Phylogenetic analyses of major viral proteins suggest that the eCPRVE+ may have been initially derived from the genome of a cognate virus belonging to a putative new genus within the family Caulimoviridae. Unexpectedly, an identical 15 kb-long locus composed of two eCPRVE copies was also detected in a newly recognized species, shedding some light on the relatively recent evolution within the cotton family.