Evaluating Genome Assemblies and Gene Models Using gVolante.

Authors

Nishimura Osamu, Hara Yuichiro, Kuraku Shigehiro

Affiliation

Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research (BDR), Kobe, Japan.

Publication

Methods Mol Biol. 2019;1962:247-256. doi: 10.1007/978-1-4939-9173-0_15.

Abstract

In daily practice of de novo genome assembly and gene prediction, it would be a natural urge to evaluate their products. Different programs and parameter settings give rise to variable outputs, which leaves a decision of which output to adopt for downstream analysis for addressing biological questions. Instead of superficial assessment of length-based statistics of output sequences (e.g., N50 scaffold length), completeness assessment by means of scoring the coverage of reference orthologs has been increasingly utilized. We previously launched a web service, gVolante (https://gvolante.riken.jp/), to provide a user-friendly interface and a uniform environment for completeness assessment with the pipelines CEGMA and BUSCO. Completeness assessments performed on gVolante report scores based on not just the coverage of reference genes but also on sequence lengths, allowing quality control in multiple aspects. This chapter focuses on the procedure for such assessment and provides technical tips for higher accuracy.
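
For readers unfamiliar with the length-based statistic mentioned above, the following is a minimal sketch of how N50 is commonly computed: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the example scaffold lengths are hypothetical and are not taken from gVolante or from the chapter.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Illustrative helper only; not part of gVolante, CEGMA, or BUSCO.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half of the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths (in bp) for a toy assembly.
scaffold_lengths = [10, 8, 6, 4, 2]
print(n50(scaffold_lengths))
```

Because this value depends only on sequence lengths, it carries no information about which genes the sequences contain, which is why the abstract contrasts it with ortholog-based completeness scoring.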

