On the transformation of MinHash-based uncorrected distances into proper evolutionary distances for phylogenetic inference.

Affiliation

Hub de Bioinformatique et Biostatistique - Département Biologie Computationnelle, Institut Pasteur, USR 3756, CNRS, 75015 Paris, France.

Publication information

F1000Res. 2020 Nov 10;9:1309. doi: 10.12688/f1000research.26930.1. eCollection 2020.

Abstract

Recently developed MinHash-based techniques have proven successful at quickly estimating the level of similarity between large nucleotide sequences. This article discusses their use and practical limitations for approximating uncorrected distances between genomes and for transforming these pairwise dissimilarities into proper evolutionary distances. Notably, it shows that complex distance measures can be easily approximated using simple transformation formulae based on a few parameters. MinHash-based techniques can therefore be very useful for implementing fast yet accurate alignment-free phylogenetic reconstruction procedures from large sets of genomes. This last point is assessed in a simulation study using a dedicated bioinformatics tool.
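
To make the pipeline described in the abstract concrete, the following is a minimal sketch, not the article's actual method or tool. It assumes a bottom-s MinHash sketch over k-mers, the commonly used approximation p ≈ 1 − (2j/(1+j))^(1/k) relating a Jaccard index j to an uncorrected distance p, and the textbook Jukes-Cantor (JC69) correction as a stand-in for the "simple transformation formulae". All function names are hypothetical.

```python
import hashlib
import math


def kmers(seq, k):
    """Return the set of all k-mers (substrings of length k) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sketch(seq, k, s):
    """Bottom-s MinHash sketch: the s smallest 64-bit hashes of the k-mers."""
    hashes = {
        int.from_bytes(hashlib.blake2b(m.encode(), digest_size=8).digest(), "big")
        for m in kmers(seq, k)
    }
    return sorted(hashes)[:s]


def jaccard_estimate(sk_a, sk_b, s):
    """Estimate the Jaccard index from two bottom-s sketches."""
    merged = sorted(set(sk_a) | set(sk_b))[:s]  # bottom-s of the union
    if not merged:
        return 0.0
    shared = set(sk_a) & set(sk_b)
    return sum(1 for h in merged if h in shared) / len(merged)


def p_distance_from_jaccard(j, k):
    """Assumed approximation: p = 1 - (2j / (1 + j)) ** (1 / k)."""
    if j <= 0.0:
        return 1.0  # no shared k-mers: the distance is saturated
    return 1.0 - (2.0 * j / (1.0 + j)) ** (1.0 / k)


def jc69(p):
    """Jukes-Cantor correction of an uncorrected distance p (assumed model)."""
    if p >= 0.75:
        return math.inf  # the correction is undefined at or beyond saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def evolutionary_distance(seq_a, seq_b, k=21, s=1000):
    """Chain the steps: sketches -> Jaccard -> p-distance -> corrected distance."""
    j = jaccard_estimate(sketch(seq_a, k, s), sketch(seq_b, k, s), s)
    return jc69(p_distance_from_jaccard(j, k))
```

The defaults `k=21` and `s=1000` are illustrative choices only; suitable values depend on genome size and divergence, and a more realistic substitution model would need additional parameters.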

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2cb7/7713896/29f655d98d70/f1000research-9-29746-g0000.jpg
