Vieira Mourato Beatriz, Tsers Ivan, Denker Svenja, Klötzl Fabian, Haubold Bernhard
Research Group Bioinformatics, Max-Planck-Institute for Evolutionary Biology, 24306 Plön, Schleswig-Holstein, Germany.
Universität zu Lübeck, Lübeck, Schleswig-Holstein, Germany.
Bioinform Adv. 2024 Jul 27;4(1):vbae113. doi: 10.1093/bioadv/vbae113. eCollection 2024.
Markers for diagnostic polymerase chain reactions are routinely constructed by taking regions common to the genomes of a target organism and subtracting the regions found in the target's closest relatives, its neighbors. This approach is implemented in the published package Fur, which originally required memory proportional to the number of nucleotides in the neighborhood. Because that total grows with every neighbor genome added, this does not scale well.
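The "common to targets, minus neighbors" idea can be illustrated with a toy k-mer sketch. It is not Fur's algorithm: the function names, the k-mer representation, and the example sequences are all assumptions made for illustration, and reverse complements are ignored. The sketch pools every neighbor into one set, so its memory grows with the whole neighborhood, which is the scaling problem described above.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_markers(targets, neighbors, k):
    """Toy illustration: k-mers found in every target and in no neighbor.

    Hypothetical helper, not part of Fur. All neighbors are pooled into a
    single set, so memory grows with the total size of the neighborhood.
    """
    common = set.intersection(*(kmers(t, k) for t in targets))
    neighborhood = set().union(*(kmers(n, k) for n in neighbors))
    return common - neighborhood


# Made-up sequences, far shorter than real genomes.
targets = ["ACGTACGGA", "TTACGGAAC"]
neighbors = ["TTCGGAT", "GGAACC"]
markers = candidate_markers(targets, neighbors, 4)
```

Here the two targets share TACG, ACGG, and CGGA; the first neighbor also contains CGGA, so only TACG and ACGG survive as candidates.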
Here, we describe a new version of Fur that only requires memory proportional to the longest neighbor. In spite of its greater memory efficiency, the new Fur remains fast and is accurate. We demonstrate this by applying it to simulated sequences and comparing it to an efficient alternative. Then we use the new Fur to extract markers from 120 reference bacteria. To make this feasible, we also introduce software for automatically finding target and neighbor genomes and for assessing markers. We pick the best primers from the 10 most sequenced reference bacteria and show their excellent sensitivity and specificity.
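The abstract does not say how the new Fur bounds its memory, so the following is only a sketch of why processing neighbors one at a time makes peak memory depend on the largest single neighbor rather than on the whole neighborhood. The k-mer sets, function names, and sequences are hypothetical and do not reflect Fur's data structures.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def subtract_streaming(candidates, neighbor_stream, k):
    """Remove from candidates every k-mer seen in any neighbor.

    Hypothetical helper: neighbors are consumed one at a time, so only the
    k-mers of the current neighbor are held in memory alongside candidates.
    """
    remaining = set(candidates)
    for seq in neighbor_stream:
        remaining -= kmers(seq, k)  # this neighbor's k-mers are freed afterwards
        if not remaining:
            break  # nothing left to subtract from
    return remaining


def read_neighbors():
    """Stand-in for reading neighbor genomes from disk one by one."""
    yield "TTCGGAT"
    yield "GGAACC"


result = subtract_streaming({"TACG", "ACGG", "CGGA"}, read_neighbors(), 4)
```

Because `read_neighbors` is a generator, no more than one neighbor sequence is materialized at any moment; in this made-up example CGGA is removed by the first neighbor and the other two candidates remain.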
Fur is available from github.com/evolbioinf/fur, in the Docker image hub.docker.com/r/beatrizvm/mapro, and in the Code Ocean capsule 10.24433/CO.7955947.v1.