
A resampling strategy for studying robustness in virus detection pipelines.

Affiliations

Institute for Animal Breeding and Genetics, University of Veterinary Medicine Hannover, Foundation, Bünteweg 17p, 30559 Hannover, Germany.

Institute for Virology and Immunobiology, University of Würzburg, Versbacher Straße 7, 97078 Würzburg, Germany.

Publication information

Comput Biol Chem. 2021 Oct;94:107555. doi: 10.1016/j.compbiolchem.2021.107555. Epub 2021 Aug 2.

DOI: 10.1016/j.compbiolchem.2021.107555
PMID: 34364046
Abstract

Next-generation sequencing is regularly used to identify viral sequences in DNA or RNA samples from infected hosts. A major step in most virus detection pipelines is mapping sequence reads against known virus genomes. Because the sequences of related viruses differ only slightly, and because of various biological and technical errors, mapping is subject to uncertainty. As a consequence, the resulting list of detected viruses can lack robustness. This work proposes a new approach for generating artificial sequencing reads, combined with a strategy of resampling from the original findings, to help assess the robustness of the originally identified list of viruses. A set of statistical distributions is derived from the original mapping result, given as a SAM file. The resampling pipeline uses these distributions to generate new artificial reads, which are again mapped against the reference genomes. By summarizing the resampling procedure, the analyst learns whether the evidence for the presence of a particular virus in the sample increases or decreases, and thus how robust the original mapping list is, both as a whole and for each individual virus in it. To judge robustness, several indicators are derived from the resampling procedure, such as the correlation between original and resampled read counts, or the statistical detection of outliers in the differences between read counts. In addition, shifts in read counts are illustrated graphically with Sankey diagrams. To demonstrate its use, the resampling approach is applied to three real-world data samples, one of them containing laboratory-confirmed Influenza sequences, and to artificially generated data in which virus sequences have been spiked into the sequencing data of a host. When the resampling pipeline is applied, several viruses drop from the original list while new viruses emerge, which indicates that the viruses remaining in the list are robust findings. The evaluation shows that the resampling approach helps to analyze the viral content of a biological sample, to rate the robustness of the original findings, and to better show the overall distribution of findings. The method is also applicable to other virus detection pipelines based on read mapping.
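The two robustness indicators named in the abstract (correlation between original and resampled read counts, and outlier detection in the read-count differences) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, Pearson correlation is assumed because the abstract does not say which correlation is used, and the 1.5 × IQR fence is an assumed stand-in for the unspecified outlier test. Read counts are taken from the RNAME column of SAM records, skipping unmapped, secondary, and supplementary alignments so that each read is counted once.

```python
import math
from collections import Counter
from statistics import quantiles


def count_reads_per_reference(sam_lines):
    """Count primary mapped reads per reference (RNAME, SAM column 3)."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        # 0x4 = unmapped, 0x100 = secondary, 0x800 = supplementary
        if flag & (0x4 | 0x100 | 0x800) or rname == "*":
            continue
        counts[rname] += 1
    return counts


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x)
    sy = sum((b - my) ** 2 for b in y)
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(sx * sy)


def robustness_indicators(original, resampled, k=1.5):
    """Return (correlation, outlier viruses) for two read-count mappings.

    Assumption: outliers are differences outside [Q1 - k*IQR, Q3 + k*IQR].
    """
    viruses = sorted(set(original) | set(resampled))
    x = [original.get(v, 0) for v in viruses]
    y = [resampled.get(v, 0) for v in viruses]
    diffs = [b - a for a, b in zip(x, y)]
    q1, _, q3 = quantiles(diffs, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    outliers = [v for v, d in zip(viruses, diffs) if d < low or d > high]
    return pearson(x, y), outliers
```

In this sketch, a virus whose read count changes far more than the others between the original and the resampled mapping is flagged as an outlier, which would mark its original detection as less robust. A virus missing from one of the two lists is treated as having a count of zero.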


Similar articles

1
A resampling strategy for studying robustness in virus detection pipelines.
Comput Biol Chem. 2021 Oct;94:107555. doi: 10.1016/j.compbiolchem.2021.107555. Epub 2021 Aug 2.
2
Measuring reproducibility of virus metagenomics analyses using bootstrap samples from FASTQ-files.
Bioinformatics. 2021 May 23;37(8):1068-1075. doi: 10.1093/bioinformatics/btaa926.
3
Virus detection in high-throughput sequencing data without a reference genome of the host.
Infect Genet Evol. 2018 Dec;66:180-187. doi: 10.1016/j.meegid.2018.09.026. Epub 2018 Oct 3.
4
Sequencing Framework for the Sensitive Detection and Precise Mapping of Defective Interfering Particle-Associated Deletions across Influenza A and B Viruses.
J Virol. 2019 May 15;93(11). doi: 10.1128/JVI.00354-19. Print 2019 Jun 1.
5
Correcting the Estimation of Viral Taxa Distributions in Next-Generation Sequencing Data after Applying Artificial Neural Networks.
Genes (Basel). 2021 Oct 31;12(11):1755. doi: 10.3390/genes12111755.
6
VirION2: a short- and long-read sequencing and informatics workflow to study the genomic diversity of viruses in nature.
PeerJ. 2021 Mar 30;9:e11088. doi: 10.7717/peerj.11088. eCollection 2021.
7
Primer ID Validates Template Sampling Depth and Greatly Reduces the Error Rate of Next-Generation Sequencing of HIV-1 Genomic RNA Populations.
J Virol. 2015 Aug;89(16):8540-55. doi: 10.1128/JVI.00522-15. Epub 2015 Jun 3.
8
Analysis of the genetic diversity of influenza A viruses using next-generation DNA sequencing.
BMC Genomics. 2015 Feb 14;16(1):79. doi: 10.1186/s12864-015-1284-z.
9
Influenza classification from short reads with VAPOR facilitates robust mapping pipelines and zoonotic strain detection for routine surveillance applications.
Bioinformatics. 2020 Mar 1;36(6):1681-1688. doi: 10.1093/bioinformatics/btz814.
10
Enhanced bioinformatic profiling of VIDISCA libraries for virus detection and discovery.
Virus Res. 2019 Apr 2;263:21-26. doi: 10.1016/j.virusres.2018.12.010. Epub 2018 Dec 19.