Department of Medicine, Division of Infectious Diseases, Stanford University, Stanford, CA 94305, USA.
Bioinformatics. 2010 Dec 1;26(23):2929-32. doi: 10.1093/bioinformatics/btq570. Epub 2010 Oct 11.
G → A hypermutation is an innate antiviral defense mechanism, mediated by host enzymes, which leads to the mutational impairment of viruses. Sensitive and specific identification of host-mediated G → A hypermutation is a novel sequence analysis challenge, particularly for viral deep sequencing studies. For example, two of the most common hepatitis B virus (HBV) reverse transcriptase (RT) drug-resistance mutations, A181T and M204I, arise from G → A changes and are routinely detected as low-abundance variants in nearly all HBV deep sequencing samples.
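The link between a single G → A change and these two amino acid substitutions can be checked against the standard genetic code. The sketch below is illustrative only: the helper name `single_g_to_a_changes` is hypothetical, and `GCT` is an assumed alanine codon chosen for the example, since the abstract does not state which codon HBV uses at RT position 181 (methionine is always encoded by `ATG`).

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def single_g_to_a_changes(codon):
    """List (mutant codon, amino acid) for every single G -> A substitution.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    codon = codon.upper()
    results = []
    for pos, base in enumerate(codon):
        if base == "G":
            mutant = codon[:pos] + "A" + codon[pos + 1:]
            results.append((mutant, CODON_TABLE[mutant]))
    return results


if __name__ == "__main__":
    # Assumed alanine codon (A181T example) and the methionine codon (M204I).
    for codon in ("GCT", "ATG", "TGG"):
        print(codon, CODON_TABLE[codon], "->", single_g_to_a_changes(codon))
```

Under the standard code, a G → A change at the first position of any alanine codon (`GCN`) gives a threonine codon (`ACN`), and `ATG` → `ATA` converts methionine to isoleucine. The tryptophan codon `TGG` is included because both of its single G → A changes produce stop codons, which is one reason G → A editing can be lethal to the virus.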
We developed a classification model using measures of G → A excess and predicted indicators of lethal mutation and applied this model to 325 920 unique deep sequencing reads from plasma virus samples from 45 drug treatment-naïve HBV-infected individuals. The 2.9% of sequence reads that were classified as hypermutated by our model included most of the reads with A181T and/or M204I, indicating the usefulness of this model for distinguishing viral adaptive changes from host-mediated viral editing.
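The abstract does not give the model's features or thresholds, so the following is only a minimal sketch of the general idea, not the authors' classifier. It assumes a read that is already aligned to a reference without gaps and in frame, and it uses two made-up cutoffs (`min_g_to_a` and `min_ratio`) to combine a crude G → A excess measure with a simple lethal-mutation indicator (in-frame stop codons). The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def g_to_a_profile(reference, read):
    """Summarise substitutions in a read relative to an aligned reference.

    Assumes equal-length, gap-free, in-frame sequences (an illustration-only
    simplification).
    """
    ref, rd = reference.upper(), read.upper()
    if len(ref) != len(rd):
        raise ValueError("reference and read must have the same length")

    g_to_a = 0   # reference G read as A
    other = 0    # any other mismatch
    for r, q in zip(ref, rd):
        if r == q:
            continue
        if r == "G" and q == "A":
            g_to_a += 1
        else:
            other += 1

    # Count in-frame stop codons in the read as a crude lethality indicator.
    stops = sum(
        1 for i in range(0, len(rd) - 2, 3) if rd[i:i + 3] in STOP_CODONS
    )
    return {"g_to_a": g_to_a, "other": other, "stop_codons": stops}


def looks_hypermutated(profile, min_g_to_a=3, min_ratio=2.0):
    """Toy decision rule with hypothetical thresholds (not from the paper)."""
    enough = profile["g_to_a"] >= min_g_to_a
    excess = profile["g_to_a"] >= min_ratio * max(profile["other"], 1)
    return enough and (excess or profile["stop_codons"] > 0)


if __name__ == "__main__":
    reference = "ATGTGGGCTGGA"   # ATG TGG GCT GGA
    read = "ATGTAGACTGAA"        # ATG TAG ACT GAA
    profile = g_to_a_profile(reference, read)
    print(profile, looks_hypermutated(profile))
```

In this toy read, all three mismatches are G → A changes and one of them creates a stop codon, so the rule flags it. A real classifier would need a background model for non-G → A mutation rates and sequencing error, which this sketch deliberately leaves out.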
Source code and sequence data are available at http://hivdb.stanford.edu/pages/resources.html.
Supplementary data are available at Bioinformatics online.