Jiang Tao, Cao Shuqi, Liu Yadong, Zhang Zhendong, Liu Bo, Luo Ruibang, Wang Guohua, Wang Yadong
Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, 150001, China.
Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, 450000, Henan, China.
Genome Biol. 2025 Jun 13;26(1):166. doi: 10.1186/s13059-025-03642-2.
Long-read sequencing technologies have great potential for the comprehensive discovery of structural variations (SVs). However, accurate genotype assignment for SVs remains challenging because of unavoidable sequencing errors, limited coverage, and the complexity of SVs. Here we propose cuteFC, which combines self-adaptive clustering with multiallele-aware clustering to achieve accurate SV regenotyping through a force-calling approach. cuteFC also applies a Genome Position Scanner algorithm to improve its efficiency. Benchmarking evaluations show that cuteFC outperforms state-of-the-art methods, with F1 scores 2-5% higher, and constructs a higher-quality genomic atlas with minimal computational resources. cuteFC is available on GitHub (https://github.com/Meltpinkg/cuteFC) and Zenodo (https://zenodo.org/records/14671406).
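To make the idea of force-calling concrete, the sketch below shows one generic way a genotype could be assigned at an already-known SV site from read counts. It is not cuteFC's algorithm: the abstract does not describe the likelihood model, so a simple binomial model is assumed here, and the function name `genotype_sv`, the error rate `err`, and the expected allele fractions are all hypothetical choices made for illustration.

```python
from math import comb, log10


def genotype_sv(alt_reads: int, ref_reads: int, err: float = 0.05):
    """Pick the most likely diploid genotype at a known SV site.

    Assumption (not from the paper): reads supporting the SV follow a
    binomial distribution whose success probability depends on genotype.
    """
    n = alt_reads + ref_reads
    if n == 0:
        # No informative reads: report a missing genotype.
        return "./.", {}

    # Expected fraction of SV-supporting reads under each genotype.
    expected = {"0/0": err, "0/1": 0.5, "1/1": 1.0 - err}

    # Binomial log10-likelihood of the observed counts for each genotype.
    log_lik = {
        gt: log10(comb(n, alt_reads))
        + alt_reads * log10(p)
        + ref_reads * log10(1.0 - p)
        for gt, p in expected.items()
    }
    best = max(log_lik, key=log_lik.get)
    return best, log_lik


if __name__ == "__main__":
    for alt, ref in [(0, 20), (10, 10), (19, 1), (0, 0)]:
        genotype, _ = genotype_sv(alt, ref)
        print(alt, ref, genotype)
```

In a real regenotyping tool the read counts would come from clustering alignment signatures around each target SV, which is where the clustering steps named in the abstract would apply; this sketch takes the counts as given.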