WgLink: reconstructing whole-genome viral haplotypes using L0+L1-regularization.

Author information

Cao Chen, Greenberg Matthew, Long Quan

Affiliations

Department of Biochemistry and Molecular Biology, Alberta Children's Hospital Research Institute, Calgary, AB, T2N 4N1 Canada.

Department of Mathematics and Statistics.

Publication information

Bioinformatics. 2021 Sep 9;37(17):2744-2746. doi: 10.1093/bioinformatics/btab076.

Abstract

SUMMARY

Many tools can reconstruct viral sequences based on next-generation sequencing reads. Although existing tools effectively recover local regions, their accuracy suffers when reconstructing the whole viral genomes (strains). Moreover, they consume significant memory when the sequencing coverage is high or when the genome size is large. We present WgLink to meet this challenge. WgLink takes local reconstructions produced by other tools as input and patches the resulting segments together into coherent whole-genome strains. We accomplish this using an L0+L1-regularized regression, synthesizing variant allele frequency data with physical linkage between multiple variants spanning multiple regions simultaneously. WgLink achieves higher accuracy than existing tools both on simulated and on real datasets while using significantly less memory (RAM) and fewer CPU hours.
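
To make the L0+L1-regularized regression concrete, here is a minimal sketch, not WgLink's actual implementation. It assumes a tiny, hypothetical set of candidate whole-genome haplotypes (the columns of `X`), observed alternate-allele frequencies per variant site (`y`), a non-negativity constraint on haplotype frequencies, and illustrative penalty weights `lam0` and `lam1`. It minimizes ||y - X b||^2 + lam0 * ||b||_0 + lam1 * ||b||_1 by exhaustively enumerating supports, which is feasible only for a handful of candidates. The abstract does not state which solver WgLink uses or how linkage information is encoded, so the function names and the data layout here are assumptions.

```python
from itertools import combinations

import numpy as np


def nonneg_lasso(X, y, lam1, n_sweeps=500):
    """Coordinate descent for min ||y - X b||^2 + lam1 * sum(b) with b >= 0."""
    n_cols = X.shape[1]
    b = np.zeros(n_cols)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_sweeps):
        for j in range(n_cols):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot explain any signal
            # Residual with column j's own contribution added back in.
            r = y - X @ b + X[:, j] * b[j]
            z = X[:, j] @ r
            # Non-negative soft-thresholding step.
            b[j] = max(z - lam1 / 2.0, 0.0) / col_sq[j]
    return b


def l0_l1_fit(X, y, lam0, lam1):
    """Brute-force L0+L1 fit: try every support, keep the lowest objective."""
    n_cols = X.shape[1]
    best_obj = float(y @ y)  # objective of the empty support (b = 0)
    best_beta = np.zeros(n_cols)
    for size in range(1, n_cols + 1):
        for support in combinations(range(n_cols), size):
            idx = list(support)
            b = nonneg_lasso(X[:, idx], y, lam1)
            resid = y - X[:, idx] @ b
            obj = float(resid @ resid
                        + lam0 * np.count_nonzero(b)
                        + lam1 * b.sum())
            if obj < best_obj:
                best_obj = obj
                best_beta = np.zeros(n_cols)
                best_beta[idx] = b
    return best_beta, best_obj


# Hypothetical toy data. Rows are variant sites, columns are candidate
# haplotypes; an entry of 1 means the candidate carries the alternate allele.
X = np.array([
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
], dtype=float)
true_freq = np.array([0.7, 0.0, 0.3, 0.0])  # assumed ground truth
y = X @ true_freq                            # noiseless site frequencies

beta, obj = l0_l1_fit(X, y, lam0=0.01, lam1=0.001)
print("selected candidates:", np.flatnonzero(beta))
print("estimated frequencies:", np.round(beta, 3))
```

On this toy input the search should select candidates 0 and 2, with estimated frequencies close to 0.7 and 0.3; the L1 term shrinks both slightly below their true values. In this sketch only per-site allele frequencies enter the regression. Physical linkage between variants, which the abstract says WgLink also uses, could in principle be added as extra rows of `X` and `y`, but that encoding is a guess rather than something the abstract specifies.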

AVAILABILITY AND IMPLEMENTATION

Source code and binaries are freely available at https://github.com/theLongLab/wglink.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
