De novo sequence assembly requires bioinformatic checking of chimeric sequences.

Affiliation

Division of Pathology, Department of Laboratory Medicine, Karolinska Institutet, Stockholm, Sweden.

Publication information

PLoS One. 2020 Aug 10;15(8):e0237455. doi: 10.1371/journal.pone.0237455. eCollection 2020.

Abstract

De novo assembly of sequence reads from next-generation sequencing platforms is a common strategy for detecting and sequencing viruses in biospecimens. Amplification artifacts and the presence of several related viruses in the same specimen can lead to the assembly of erroneous, chimeric sequences. We now report that such chimeras can also arise when viral and non-viral biological sequences are incorrectly joined together, which may cause erroneous detection of viruses and highlights the importance of a chimera-checking step in bioinformatics pipelines. Using Illumina NextSeq and metagenomic sequencing, we analyzed 80 consecutive non-melanoma skin cancers (NMSCs) from 11 immunosuppressed patients together with 11 NMSCs from patients who had developed only 1 NMSC. We aligned high-quality reads against a human papillomavirus (HPV) database and found HPV sequences in 9/91 specimens. A previous bioinformatic analysis of the same crude sequencing data from some of these samples had found an additional 3 specimens to be HPV-positive after de novo assembly. We investigated the reason for the discrepancy and found that it was mostly caused by chimeric sequences containing both viral and non-viral sequences. Non-viral sequences were present in these 3 samples. To avoid erroneous detection of HPV when performing sequencing, we therefore developed a novel script to identify HPV chimeric sequences.
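The abstract does not describe how the authors' script works, so the following is only a minimal sketch of one way a chimera check on assembled contigs could look, not the published method. It assumes that each contig has already been aligned against both an HPV database and a non-viral database, and that the alignments are available as intervals on the contig. The names `Hit`, `covered_positions`, `classify_contig` and the 10% threshold `min_fraction` are hypothetical and chosen for illustration.

```python
from typing import Iterable, NamedTuple, Set


class Hit(NamedTuple):
    """One alignment of a contig region to a reference (hypothetical format)."""
    start: int  # 0-based, inclusive position on the contig
    end: int    # 0-based, exclusive position on the contig
    label: str  # "viral" or "nonviral"


def covered_positions(hits: Iterable[Hit], label: str) -> Set[int]:
    """Return the set of contig positions covered by hits with the given label."""
    positions: Set[int] = set()
    for hit in hits:
        if hit.label == label:
            positions.update(range(hit.start, hit.end))
    return positions


def classify_contig(contig_length: int, hits: Iterable[Hit],
                    min_fraction: float = 0.10) -> str:
    """Label a contig as 'chimeric', 'viral', or 'nonviral_or_unassigned'.

    A contig is called chimeric when at least `min_fraction` of its length
    is covered by viral hits AND at least `min_fraction` is covered by
    non-viral hits that do not overlap any viral hit. The threshold is an
    illustrative assumption, not a value taken from the paper.
    """
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    hits = list(hits)
    viral = covered_positions(hits, "viral")
    # Positions matched by both databases are treated as viral, so only
    # unambiguously non-viral positions count toward the chimera call.
    nonviral_only = covered_positions(hits, "nonviral") - viral
    viral_fraction = len(viral) / contig_length
    nonviral_fraction = len(nonviral_only) / contig_length
    if viral_fraction >= min_fraction and nonviral_fraction >= min_fraction:
        return "chimeric"
    if viral_fraction >= min_fraction:
        return "viral"
    return "nonviral_or_unassigned"


if __name__ == "__main__":
    # A 1000-base contig whose first 400 bases match HPV and whose
    # remainder matches a non-viral reference.
    example_hits = [Hit(0, 400, "viral"), Hit(380, 1000, "nonviral")]
    print(classify_contig(1000, example_hits))
```

In a setup like this, contigs flagged as chimeric would be reviewed or excluded before a specimen is reported as HPV-positive, which is the kind of safeguard the abstract argues for.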

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3aa6/7417191/693f5e4feca2/pone.0237455.g001.jpg