Quality control of next-generation sequencing data without a reference.

Author information

Edinburgh Genomics, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK.

Edinburgh Genomics, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK; Institute of Evolutionary Biology, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK.

Publication information

Front Genet. 2014 May 6;5:111. doi: 10.3389/fgene.2014.00111. eCollection 2014.

Abstract

Next-generation sequencing (NGS) technologies have dramatically expanded the breadth of genomics. Genome-scale data, once restricted to a small number of biomedical model organisms, can now be generated for virtually any species at remarkable speed and low cost. Yet non-model organisms often lack a suitable reference to map sequence reads against, making alignment-based quality control (QC) of NGS data more challenging than in cases where a well-assembled genome is already available. Here we show that by generating a rapid, non-optimized draft assembly of raw reads, it is possible to obtain reliable and informative QC metrics, thus removing the need for a high-quality reference. We use benchmark datasets generated from control samples across a range of genome sizes to illustrate that QC inferences made using draft assemblies are broadly equivalent to those made using a well-established reference, and we describe QC tools routinely used in our production facility to assess the quality of NGS data from non-model organisms.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b35d/4018527/55cc7e68bcf6/fgene-05-00111-g0001.jpg
