QColors: an algorithm for conservative viral quasispecies reconstruction from short and non-contiguous next generation sequencing reads.

Authors

Huang Austin, Kantor Rami, DeLong Allison, Schreier Leeann, Istrail Sorin

Affiliation

Division of Infectious Disease, Computer Science Department, Brown University, Box 1910, Providence, RI 02912, USA.

Publication information

In Silico Biol. 2011;11(5-6):193-201. doi: 10.3233/ISB-2012-0454.

DOI:10.3233/ISB-2012-0454

PMID:23202421

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5530257/

Abstract

Next generation sequencing technologies have recently been applied to characterize mutational spectra of the heterogeneous population of viral genotypes (known as a quasispecies) within HIV-infected patients. Such information is clinically relevant because minority genetic subpopulations of HIV within patients enable viral escape from selection pressures such as the immune response and antiretroviral therapy. However, methods for quasispecies sequence reconstruction from next generation sequencing reads are not yet widely used, and the problem remains an emerging area of research. Furthermore, the majority of research methodology in HIV has focused on 454 sequencing, while many next generation sequencing platforms used in practice are limited to shorter read lengths relative to 454 sequencing. Little work has been done to determine how best to address the read length limitations of other platforms. The approach described here incorporates graph representations of both read differences and read overlap to conservatively determine the regions of the sequence with sufficient variability to separate quasispecies sequences. Within these tractable regions of quasispecies inference, we use constraint programming to solve for an optimal quasispecies subsequence determination via vertex coloring of the conflict graph, a representation which also lends itself to data with non-contiguous reads such as paired-end sequencing. We demonstrate the utility of the method by applying it to simulations based on actual intra-patient clonal HIV-1 sequencing data.
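
The conflict-graph coloring idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract says constraint programming is used, whereas the code below swaps in a plain backtracking search for a minimum vertex coloring. The read representation (a dict mapping genome position to base, which allows non-contiguous reads) and all function names are assumptions made for illustration.

```python
from itertools import combinations

def conflicts(read_a, read_b):
    """Assumed rule: reads conflict if they disagree at any position both cover."""
    shared = read_a.keys() & read_b.keys()
    return any(read_a[p] != read_b[p] for p in shared)

def build_conflict_graph(reads):
    """Adjacency sets over read indices; an edge joins two conflicting reads."""
    adj = {i: set() for i in range(len(reads))}
    for i, j in combinations(range(len(reads)), 2):
        if conflicts(reads[i], reads[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj

def color_with_k(adj, k):
    """Backtracking search for a proper coloring with at most k colors."""
    order = sorted(adj, key=lambda v: -len(adj[v]))  # high-degree vertices first
    colors = {}

    def assign(idx):
        if idx == len(order):
            return True
        v = order[idx]
        used = {colors[u] for u in adj[v] if u in colors}
        # Symmetry breaking: never skip ahead to a color index beyond "next unused".
        limit = min(k, max(colors.values()) + 2) if colors else 1
        for c in range(limit):
            if c not in used:
                colors[v] = c
                if assign(idx + 1):
                    return True
                del colors[v]
        return False

    return dict(colors) if assign(0) else None

def min_coloring(adj):
    """Try k = 1, 2, ... until a proper coloring exists (exponential worst case)."""
    for k in range(1, len(adj) + 1):
        result = color_with_k(adj, k)
        if result is not None:
            return k, result
    return 0, {}

# Made-up toy reads, not data from the paper. Reads 3 and 4 are non-contiguous,
# as a paired-end read would be.
EXAMPLE_READS = [
    {0: "A", 1: "C", 2: "G", 3: "T"},
    {2: "G", 3: "T", 4: "A", 5: "A"},
    {2: "G", 3: "C", 4: "A", 5: "A"},  # differs from reads 0 and 1 at position 3
    {0: "A", 1: "C", 8: "G", 9: "G"},
    {5: "A", 8: "T", 9: "G"},          # differs from read 3 at position 8
]

graph = build_conflict_graph(EXAMPLE_READS)
num_colors, coloring = min_coloring(graph)
print(num_colors, coloring)
```

Under these assumptions, reads that share a color are mutually compatible and could belong to the same quasispecies sequence, and the number of colors is a lower bound on the number of distinct sequences needed to explain the reads. A real pipeline would also have to handle sequencing errors, which this sketch ignores.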

Similar articles

Streamlined Subpopulation, Subtype, and Recombination Analysis of HIV-1 Half-Genome Sequences Generated by High-Throughput Sequencing.

mSphere. 2020 Oct 14;5(5):e00551-20. doi: 10.1128/mSphere.00551-20.

aBayesQR: A Bayesian Method for Reconstruction of Viral Populations Characterized by Low Diversity.

J Comput Biol. 2018 Jul;25(7):637-648. doi: 10.1089/cmb.2017.0249. Epub 2018 Feb 26.

Viral quasispecies reconstruction via tensor factorization with successive read removal.

Bioinformatics. 2018 Jul 1;34(13):i23-i31. doi: 10.1093/bioinformatics/bty291.

QSdpR: Viral quasispecies reconstruction via correlation clustering.

Genomics. 2018 Nov;110(6):375-381. doi: 10.1016/j.ygeno.2017.12.007. Epub 2017 Dec 19.

An efficient and scalable graph modeling approach for capturing information at different levels in next generation sequencing reads.

BMC Bioinformatics. 2013;14 Suppl 11(Suppl 11):S7. doi: 10.1186/1471-2105-14-S11-S7. Epub 2013 Nov 4.

Reconstruction of viral population structure from next-generation sequencing data using multicommodity flows.

BMC Bioinformatics. 2013;14 Suppl 9(Suppl 9):S2. doi: 10.1186/1471-2105-14-S9-S2. Epub 2013 Jun 28.

HIV-1 quasispecies delineation by tag linkage deep sequencing.

PLoS One. 2014 May 19;9(5):e97505. doi: 10.1371/journal.pone.0097505. eCollection 2014.

Combinatorial analysis and algorithms for quasispecies reconstruction using next-generation sequencing.

BMC Bioinformatics. 2011 Jan 5;12:5. doi: 10.1186/1471-2105-12-5.

Applying next-generation sequencing to unravel the mutational landscape in viral quasispecies.

Virus Res. 2020 Jul 2;283:197963. doi: 10.1016/j.virusres.2020.197963. Epub 2020 Apr 9.

Cited by

Reconstruction of Microbial Haplotypes by Integration of Statistical and Physical Linkage in Scaffolding.

Mol Biol Evol. 2021 May 19;38(6):2660-2672. doi: 10.1093/molbev/msab037.

Third-order nanocircuit elements for neuromorphic engineering.

Nature. 2020 Sep;585(7826):518-523. doi: 10.1038/s41586-020-2735-5. Epub 2020 Sep 23.

Epidemiological data analysis of viral quasispecies in the next-generation sequencing era.

Brief Bioinform. 2021 Jan 18;22(1):96-108. doi: 10.1093/bib/bbaa101.

Evaluation of haplotype callers for next-generation sequencing of viruses.

Infect Genet Evol. 2020 Aug;82:104277. doi: 10.1016/j.meegid.2020.104277. Epub 2020 Mar 6.

BHap: a novel approach for bacterial haplotype reconstruction.

Bioinformatics. 2019 Nov 1;35(22):4624-4631. doi: 10.1093/bioinformatics/btz280.

De novo assembly of viral quasispecies using overlap graphs.

Genome Res. 2017 May;27(5):835-848. doi: 10.1101/gr.215038.116. Epub 2017 Apr 10.

Estimation of genetic diversity in viral populations from next generation sequencing data with extremely deep coverage.

Algorithms Mol Biol. 2016 Mar 11;11:2. doi: 10.1186/s13015-016-0064-x. eCollection 2016.

Inferring the Clonal Structure of Viral Populations from Time Series Sequencing.

PLoS Comput Biol. 2015 Nov 16;11(11):e1004344. doi: 10.1371/journal.pcbi.1004344. eCollection 2015 Nov.

Frequency-based haplotype reconstruction from deep sequencing data of bacterial populations.

Nucleic Acids Res. 2015 Sep 18;43(16):e105. doi: 10.1093/nar/gkv478. Epub 2015 May 18.

On the complexity of Minimum Path Cover with Subpath Constraints for multi-assembly.

BMC Bioinformatics. 2014;15 Suppl 9(Suppl 9):S5. doi: 10.1186/1471-2105-15-S9-S5. Epub 2014 Sep 10.
