Nebula: ultra-efficient mapping-free structural variant genotyper.

Affiliations

Genome Center, UC Davis, Davis, California, 95616, USA.

UC Davis MIND Institute, Sacramento, California, 95817, USA.

Publication information

Nucleic Acids Res. 2021 May 7;49(8):e47. doi: 10.1093/nar/gkab025.

Abstract

Large-scale catalogs of common genetic variants (including indels and structural variants) are being created using data from second- and third-generation whole-genome sequencing technologies. However, genotyping these variants in newly sequenced samples is a nontrivial task that requires extensive computational resources. Furthermore, current approaches are mostly limited to specific types of variants and are generally prone to errors and ambiguities when genotyping complex events. We propose an ultra-efficient approach for genotyping any type of structural variation that is not limited by the shortcomings and complexities of current mapping-based approaches. Our method, Nebula, uses changes in k-mer counts to predict the genotype of structural variants. We show that Nebula is not only an order of magnitude faster than mapping-based approaches for genotyping structural variants, but also has accuracy comparable to state-of-the-art approaches. Furthermore, Nebula is a generic framework not limited to any specific type of event. Nebula is publicly available at https://github.com/Parsoa/Nebula.
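To make the idea of genotyping from k-mer counts concrete, here is a minimal Python sketch. It is not Nebula's actual algorithm or API. The function names, the notion of "signature" k-mers (assumed to occur only on the variant allele), the `expected_depth` parameter, and the 0.25 / 0.75 thresholds are all illustrative assumptions. It shows only the general principle: counting k-mers directly in reads, with no alignment step, can give a rough genotype call.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer across a collection of read strings (no mapping)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def genotype_from_kmers(read_counts, signature_kmers, expected_depth):
    """Call a genotype from the counts of variant-specific k-mers.

    Assumptions (hypothetical, for illustration only):
      - signature_kmers occur only when the variant allele is present;
      - expected_depth is the count expected for a k-mer carried on
        both chromosome copies;
      - the 0.25 and 0.75 cut-offs are arbitrary, not taken from the paper.
    """
    if not signature_kmers or expected_depth <= 0:
        raise ValueError("need at least one k-mer and a positive depth")
    # A Counter returns 0 for k-mers that were never observed.
    mean_count = sum(read_counts[kmer] for kmer in signature_kmers) / len(signature_kmers)
    ratio = mean_count / expected_depth
    if ratio < 0.25:
        return "0/0"  # variant allele absent
    if ratio < 0.75:
        return "0/1"  # heterozygous
    return "1/1"      # homozygous for the variant


# Toy usage: ten identical reads and a single signature k-mer.
toy_counts = count_kmers(["ACGTAC"] * 10, k=4)
toy_call = genotype_from_kmers(toy_counts, ["ACGT"], expected_depth=20)
```

In this toy example the signature k-mer is seen at about half the expected two-copy depth, so the sketch calls the sample heterozygous. A real genotyper would also have to handle sequencing errors, repeated k-mers, and coverage bias.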

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/76ee/8096284/1bdadb1730ca/gkab025fig1.jpg
