QuasiSeq: profiling viral quasispecies via self-tuning spectral clustering with PacBio long sequencing reads.

Affiliations

Laboratory of Human Retrovirology and Immunoinformatics, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA.

Laboratory of Immunoregulation, National Institute of Allergy and Infectious Diseases, Bethesda, MD 20892, USA.

Publication information

Bioinformatics. 2022 Jun 13;38(12):3192-3199. doi: 10.1093/bioinformatics/btac313.

DOI:10.1093/bioinformatics/btac313
PMID:35532087
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9890302/

Abstract

MOTIVATION

The existence of quasispecies in the viral population causes difficulties for disease prevention and treatment. High-throughput sequencing provides an opportunity to determine rare quasispecies, and long sequencing reads covering full genomes reduce quasispecies determination to a clustering problem. The challenge is the high similarity of quasispecies and the high error rate of long sequencing reads.

RESULTS

We developed QuasiSeq using a novel signature-based self-tuning clustering method, SigClust, to profile viral mixtures with high accuracy and sensitivity. QuasiSeq can correctly identify quasispecies even when using low-quality sequencing reads (accuracy <80%) and produces quasispecies sequences with high accuracy (≥99.55%). Using high-quality circular consensus sequencing reads, QuasiSeq can produce quasispecies sequences with 100% accuracy. QuasiSeq has higher sensitivity and specificity than similar published software. Moreover, the computational resource requirement can be controlled by the size of the signature, which makes it possible to handle large sequencing datasets for rare quasispecies discovery. Furthermore, parallel computation is implemented to process the clusters and further reduce the runtime. Finally, we developed a web interface for the QuasiSeq workflow with simple parameter settings based on the quality of sequencing data, making it easy to use for users without advanced data science skills.
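
To make the clustering idea concrete, the following is a minimal sketch of self-tuning spectral clustering applied to read signatures. It is not the QuasiSeq or SigClust implementation. It assumes that a signature is a fixed-length string of bases taken from selected variable positions, that reads are compared by Hamming distance, and that the number of clusters is chosen with a simple eigengap heuristic. All function names and parameters (`cluster_signatures`, `k`, `max_clusters`) are hypothetical. The only "self-tuning" ingredient shown is the local scaling of the affinity, where each read gets its own scale from the distance to its k-th nearest neighbour.

```python
from collections import Counter

import numpy as np


def hamming_matrix(signatures):
    """Pairwise Hamming distances between equal-length signature strings."""
    arr = np.array([list(s) for s in signatures])
    return (arr[:, None, :] != arr[None, :, :]).sum(axis=2).astype(float)


def local_scaling_affinity(dist, k=3):
    """Affinity A_ij = exp(-d_ij^2 / (sigma_i * sigma_j)) with per-read scales."""
    # sigma_i is the distance from read i to its k-th nearest neighbour
    # (column 0 of each sorted row is the read itself, at distance 0).
    sigma = np.maximum(np.sort(dist, axis=1)[:, k], 1e-9)
    aff = np.exp(-(dist ** 2) / np.outer(sigma, sigma))
    np.fill_diagonal(aff, 0.0)
    return aff


def spectral_embedding(aff, max_clusters=5):
    """Embed reads with the top eigenvectors of D^-1/2 A D^-1/2."""
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(aff.sum(axis=1), 1e-12))
    norm_aff = d_inv_sqrt[:, None] * aff * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(norm_aff)      # eigenvalues in ascending order
    vals, vecs = vals[::-1], vecs[:, ::-1]     # largest first
    m = min(max_clusters, len(vals) - 1)
    gaps = vals[:m] - vals[1:m + 1]
    n_clusters = int(np.argmax(gaps)) + 1      # assumed eigengap heuristic
    emb = vecs[:, :n_clusters].copy()
    emb /= np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    return emb, n_clusters


def kmeans_farthest_first(points, n_clusters, n_iter=20):
    """Deterministic k-means: farthest-first seeding, then Lloyd updates."""
    centers = [points[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[int(np.argmax(d))])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels


def cluster_signatures(signatures, k=3, max_clusters=5):
    """Return one integer cluster label per signature (needs more than k reads)."""
    aff = local_scaling_affinity(hamming_matrix(signatures), k)
    emb, n_clusters = spectral_embedding(aff, max_clusters)
    return kmeans_farthest_first(emb, n_clusters)


def consensus(signatures):
    """Majority base at each position of a group of signatures."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*signatures))


if __name__ == "__main__":
    # Toy data: two hypothetical haplotypes, six reads each, one error per read.
    hap_a, hap_b = "A" * 12, "C" * 12
    reads = [hap_a[:i] + "G" + hap_a[i + 1:] for i in range(6)]
    reads += [hap_b[:i] + "T" + hap_b[i + 1:] for i in range(6)]
    labels = cluster_signatures(reads)
    for c in sorted(set(labels.tolist())):
        members = [r for r, lab in zip(reads, labels) if lab == c]
        print(c, len(members), consensus(members))
```

In this sketch the pairwise distance and affinity matrices grow quadratically with the number of reads, while the signature length sets the cost of each comparison, which is one way to read the abstract's remark that signature size controls resource use. Taking a per-cluster consensus is how sequencing errors in individual reads are averaged out in the toy example; the real workflow may do this differently.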

AVAILABILITY AND IMPLEMENTATION

QuasiSeq is open source and freely available at https://github.com/LHRI-Bioinformatics/QuasiSeq. The current release (v1.0.0) is archived and available at https://zenodo.org/badge/latestdoi/340494542.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
