Department of Medicine, University of Cambridge, Addenbrooke's Hospital, Cambridge, United Kingdom.
Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Centre for Mathematical Sciences, Cambridge, United Kingdom.
PLoS Comput Biol. 2019 Sep 23;15(9):e1007345. doi: 10.1371/journal.pcbi.1007345. eCollection 2019 Sep.
HIV-1 replicates via a low-fidelity polymerase with a high mutation rate, so strong conservation of individual nucleotides is highly indicative of critical structural or functional properties. Identifying such conservation can reveal novel insights into viral behaviour. We analysed 3651 publicly available sequences for nucleic acid conservation beyond that required by amino acid constraints, using a codon scoring algorithm together with a novel scale-free method that identifies regions of outlying score. Sequences with outlying scores were further analysed using an algorithm that produces local RNA folds whilst accounting for alignment properties. Eleven different conserved regions were identified, some corresponding to well-known cis-acting functions of the HIV-1 genome and others whose conservation has not previously been noted. We identify rational causes for many of these, including cis functions, possible additional reading frame usage, a plausible mechanism by which the central polypurine tract primes second-strand DNA synthesis, and a conformational stabilising function of a region at the 5' end of env.
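The idea of nucleotide conservation beyond amino acid constraints can be illustrated with a toy calculation. The sketch below is not the codon scoring algorithm or the scale-free outlier method used in the paper, neither of which is specified in the abstract. It assumes an in-frame, gap-aligned set of coding sequences and, for each codon column, reports the fraction of sequences using the most common codon among those that encode the most common amino acid. The function name `synonymous_conservation` and the scoring rule are hypothetical choices made for illustration only.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# Number of codons encoding each amino acid (1 for Met and Trp).
DEGENERACY = Counter(CODON_TABLE.values())


def synonymous_conservation(seqs):
    """Toy per-codon score (an illustrative assumption, not the paper's method).

    For each codon column, take the sequences encoding the modal amino acid
    and return the fraction of them that use the modal codon. Columns whose
    modal amino acid has a single codon, or that contain no unambiguous
    codons, are scored as None because synonymous variation is impossible
    or unobservable there.
    """
    n_codons = min(len(s) for s in seqs) // 3
    scores = []
    for i in range(n_codons):
        codons = [s[3 * i:3 * i + 3].upper() for s in seqs]
        # Skip codons containing gaps or ambiguity codes.
        codons = [c for c in codons if c in CODON_TABLE]
        if not codons:
            scores.append(None)
            continue
        modal_aa = Counter(CODON_TABLE[c] for c in codons).most_common(1)[0][0]
        if DEGENERACY[modal_aa] == 1:
            scores.append(None)
            continue
        synonymous = [c for c in codons if CODON_TABLE[c] == modal_aa]
        modal_count = Counter(synonymous).most_common(1)[0][1]
        scores.append(modal_count / len(synonymous))
    return scores


if __name__ == "__main__":
    alignment = ["ATGCTGGGA", "ATGCTGGGC", "ATGCTGGGT", "ATGCTGGGG"]
    print(synonymous_conservation(alignment))
```

In this toy alignment the leucine codon is identical in every sequence even though six synonymous codons are available, whereas the glycine codon varies freely at its third position. A column like the former is the kind of signal the abstract describes, although a real analysis would also need a background model and a principled way to call outlying regions.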