
BlobToolKit - Interactive Quality Assessment of Genome Assemblies.

Affiliations

Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, UK

Wellcome Sanger Institute, Cambridge CB10 1SA, UK.

Publication information

G3 (Bethesda). 2020 Apr 9;10(4):1361-1374. doi: 10.1534/g3.119.400908.

Abstract

Reconstruction of target genomes from sequence data produced by instruments that are agnostic as to the species-of-origin may be confounded by contaminant DNA. Whether the contaminant is introduced during sample processing or co-extracted alongside the target DNA, the final assembled genome may be a mixture of data from several species if insufficient care is taken during the assembly process. Such assemblies can confound sequence-based biological inference and, when deposited in public databases, may be included in downstream analyses by users unaware of the underlying problems. We present BlobToolKit, a software suite to aid researchers in identifying and isolating non-target data in draft and publicly available genome assemblies. BlobToolKit can be used to process assembly, read and analysis files for fully reproducible interactive exploration in the browser-based Viewer. BlobToolKit can be used during assembly to filter non-target DNA, helping researchers produce assemblies with high biological credibility. We have been running an automated BlobToolKit pipeline on eukaryotic assemblies publicly available in the International Nucleotide Sequence Database Collaboration and are making the results available through a public instance of the Viewer at https://blobtoolkit.genomehubs.org/view. We aim to complete analysis of all publicly available genomes and then maintain currency with the flow of new genomes. We have worked to embed these views into the presentation of genome assemblies at the European Nucleotide Archive, providing an indication of assembly quality alongside the public record with links out to allow full exploration in the Viewer.

