Analysis of high-depth sequence data for studying viral diversity: a comparison of next generation sequencing platforms using Segminator II.

Affiliation

Computational and Evolutionary Biology, Faculty of Life Sciences, University of Manchester, Manchester, UK.

Publication information

BMC Bioinformatics. 2012 Mar 23;13:47. doi: 10.1186/1471-2105-13-47.

DOI:10.1186/1471-2105-13-47

PMID:22443413

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3359224/

Abstract

BACKGROUND

Next generation sequencing provides detailed insight into the variation present within viral populations, introducing the possibility of treatment strategies that are both reactive and predictive. Current software tools, however, need to be scaled up to accommodate high-depth viral data sets, which are often temporally or spatially linked. In addition, because novel sequencing platforms and chemistries continue to be developed, each with its own strengths and weaknesses, researchers would benefit from being able to routinely compare and combine data sets from different platforms and chemistries. In particular, the error associated with a specific sequencing process must be quantified so that true biological variation can be identified.

RESULTS

Segminator II was developed to allow efficient comparison of data sets derived from different sources. We demonstrate its usage by comparing large data sets from 12 influenza H1N1 samples sequenced on both the 454 Life Sciences and Illumina platforms, permitting quantification of platform error. For mismatches, median error rates of 0.10% and 0.12%, respectively, suggested that both platforms performed similarly. For insertions and deletions, median error rates within the 454 data (0.3% and 0.2%, respectively) were significantly higher than those within the Illumina data (0.004% and 0.006%, respectively). In agreement with previous observations, these higher rates were strongly associated with homopolymeric stretches on the 454 platform. Outside of such regions, both platforms had similar indel error profiles. Additionally, we apply our software to the identification of low frequency variants.
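To make the idea of a "median error rate" concrete, the following is a minimal sketch in Python, not Segminator II's own code. It assumes per-site base counts are already available as dictionaries and that every non-reference base at a site is an error (that is, the site carries no true variant). The function names and the toy data are hypothetical.

```python
from statistics import median


def mismatch_rate(base_counts, reference_base):
    """Fraction of reads at one site whose base differs from the reference.

    Returns None when the site has no coverage.
    """
    depth = sum(base_counts.values())
    if depth == 0:
        return None
    mismatches = depth - base_counts.get(reference_base, 0)
    return mismatches / depth


def median_mismatch_rate(sites):
    """Median per-site mismatch rate over (base_counts, reference_base) pairs.

    Sites without coverage are skipped; returns None if no site is covered.
    """
    rates = []
    for base_counts, reference_base in sites:
        rate = mismatch_rate(base_counts, reference_base)
        if rate is not None:
            rates.append(rate)
    return median(rates) if rates else None


# Hypothetical toy data: three sites, each with 1000 reads of coverage.
toy_sites = [
    ({"A": 999, "G": 1}, "A"),
    ({"C": 998, "T": 2}, "C"),
    ({"G": 1000}, "G"),
]

if __name__ == "__main__":
    print(median_mismatch_rate(toy_sites))
```

Running the same calculation separately on the counts from each platform gives two medians that can be compared directly, which is the kind of summary reported above. The median is used rather than the mean so that a few sites with genuine variation do not dominate the estimate.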

CONCLUSION

We have demonstrated, using Segminator II, that it is possible to distinguish platform specific error from biological variation using data derived from two different platforms. We have used this approach to quantify the amount of error present within the 454 and Illumina platforms in relation to genomic location as well as location on the read. Given that next generation data is increasingly important in the analysis of drug-resistance and vaccine trials, this software will be useful to the pathogen research community. A zip file containing the source code and jar file is freely available for download from http://www.bioinf.manchester.ac.uk/segminator/.
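One simple way to picture the cross-platform reasoning is sketched below in Python. This is an illustration under stated assumptions, not the method implemented in Segminator II: it assumes each candidate variant has an observed frequency on both platforms and applies a single, hypothetical frequency threshold (`min_freq`) to each. A variant seen above the threshold on both platforms is treated as a candidate biological variant, while one seen on only one platform is flagged as a possible platform-specific error.

```python
def classify_variant(freq_454, freq_illumina, min_freq=0.01):
    """Label a variant by whether it reaches min_freq on both platforms.

    min_freq is a hypothetical threshold chosen for illustration only.
    """
    seen_454 = freq_454 >= min_freq
    seen_illumina = freq_illumina >= min_freq
    if seen_454 and seen_illumina:
        return "candidate biological variant"
    if seen_454 or seen_illumina:
        return "possible platform-specific error"
    return "below threshold"


if __name__ == "__main__":
    # Hypothetical frequencies for three sites.
    print(classify_variant(0.05, 0.04))
    print(classify_variant(0.03, 0.0001))
    print(classify_variant(0.001, 0.002))
```

A real analysis would also need to account for read depth and sampling noise at each site, which this sketch deliberately ignores.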
