Accurate reconstruction of viral genomes in human cells from short reads using iterative refinement.

Affiliations

Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong.

Department of Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Shatin, Hong Kong.

Publication information

BMC Genomics. 2022 Jun 6;23(1):422. doi: 10.1186/s12864-022-08649-8.

Abstract

BACKGROUND

After an infection, human cells may contain viral genomes in the form of episomes or integrated DNA. Comparing the genomic sequences of different strains of a virus in human cells can provide useful insights into its behaviour, activity and pathology, and may help in developing methods for disease prevention and treatment. To support such comparative analyses, the viral genomes need to be accurately reconstructed from a large number of samples. Previous efforts either rely on customized experimental protocols or require high similarity between the sequenced genomes and a reference, both of which limit their general applicability. In this study, we propose a pipeline, named ASPIRE, for accurately reconstructing viral genomes from short-read data of human samples, which are increasingly available from genome projects and personal genomics. ASPIRE consists of a basic part that involves de novo assembly, tiling and gap filling, plus additional components for iterative refinement, sequence correction and wrapping.

RESULTS

Evaluated by the alignment quality of sequencing reads to the reconstructed genomes, these additional components generally improve assembly quality, and in some samples substantially so, especially when the sequenced genome differs markedly from the reference. We use ASPIRE to reconstruct the genomes of Epstein-Barr virus (EBV) from the whole-genome sequencing data of 61 nasopharyngeal carcinoma (NPC) samples and provide these sequences as a resource for EBV research.

CONCLUSIONS

ASPIRE improves the quality of the reconstructed EBV genomes in published studies and outperforms TRACESPipe in some of the samples considered.
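
ILLUSTRATIVE SKETCH

The abstract names iterative refinement as one of the additional components but does not describe how it is implemented. The toy Python sketch below illustrates only the general idea: place reads on the current draft, rebuild the draft from the reads, and repeat until nothing changes. It is not ASPIRE's code. All function names are hypothetical, reads are placed by ungapped Hamming distance rather than by a real aligner, and only substitutions are handled (no insertions, deletions or wrapping of a circular genome).

```python
from collections import Counter


def best_offset(read, draft):
    """Return the ungapped offset where the read has the fewest mismatches."""
    best, best_mismatches = 0, len(read) + 1
    for offset in range(len(draft) - len(read) + 1):
        window = draft[offset:offset + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches < best_mismatches:
            best, best_mismatches = offset, mismatches
    return best


def refine_once(draft, reads):
    """Place every read on the draft and take a per-column majority vote."""
    columns = [Counter() for _ in draft]
    for read in reads:
        offset = best_offset(read, draft)
        for i, base in enumerate(read):
            columns[offset + i][base] += 1
    # Columns that no read covers keep the base from the current draft.
    return "".join(
        col.most_common(1)[0][0] if col else old
        for col, old in zip(columns, draft)
    )


def iterative_refinement(reference, reads, max_rounds=10):
    """Start from the reference and refine until the draft stops changing."""
    draft = reference
    for _ in range(max_rounds):
        updated = refine_once(draft, reads)
        if updated == draft:
            break
        draft = updated
    return draft


# Toy data (made up for illustration): the sample differs from the
# reference by one substitution at index 3.
truth = "ACGTTGCAAG"
reference = "ACGATGCAAG"
reads = [truth[0:9], truth[1:10]] * 2
refined = iterative_refinement(reference, reads)
```

In this toy case the refined draft matches the sample sequence after a single round, and the second round only confirms convergence. When the sample diverges strongly from the reference, early rounds may misplace reads, which is the situation where repeating the placement against an improved draft is expected to help.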

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02e2/9169298/300291b71623/12864_2022_8649_Fig1_HTML.jpg