How to handle high subgenome sequence similarity in allopolyploid Fragaria x ananassa: linkage disequilibrium based variant filtering.

Author information

Fresh Forward Breeding B.V., Huissen, The Netherlands.

Wageningen University and Research Plant Breeding, Wageningen, The Netherlands.

Publication information

BMC Genomics. 2024 Nov 28;25(1):1150. doi: 10.1186/s12864-024-10987-8.

Abstract

BACKGROUND

The allo-octoploid Fragaria x ananassa follows disomic inheritance, yet the high sequence similarity among its subgenomes can cause short sequencing reads (150 bp) to be misaligned. This misalignment increases the number of erroneous variants produced during variant calling. To associate traits with the correct subgenome, these erroneous variants must be filtered out. Classifying variants as correct (type 1) or erroneous (homoeologous variants, type 2; multi-locus variants, type 3) can improve the reliability of downstream analyses.

RESULTS

Our analysis shows that although erroneous variant types often display a skewed average allele balance (AAB) for heterozygous calls, this measure alone is insufficient. To reduce erroneous variants further, we employed a linkage disequilibrium (LD) based filtering method that correlates highly (99%) with an approach that uses a genetic map from a biparental population. The combined filtering strategy, using both the LD-based and AAB methods, resulted in the lowest switch error rate (0.037). Notably, our best filtering approach decreased phasing switch error rates by 44% while preserving 72% of the original dataset.
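The two measures named above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes diploid-style genotype strings such as "0/1" with per-call reference and alternate read depths. The function names and the 0.3 to 0.7 acceptance window are hypothetical choices made for illustration, and the switch error rate follows the common definition (the fraction of adjacent heterozygous site pairs whose relative phase disagrees with a reference phasing).

```python
from typing import Optional, Sequence, Tuple

# One genotype call: (genotype string, reference read depth, alternate read depth).
Call = Tuple[str, int, int]


def average_allele_balance(calls: Sequence[Call]) -> Optional[float]:
    """Mean alternate-read fraction over the heterozygous calls of one variant.

    Returns None when the variant has no usable heterozygous call.
    """
    balances = []
    for genotype, ref_depth, alt_depth in calls:
        alleles = genotype.replace("|", "/").split("/")
        is_het = "." not in alleles and len(set(alleles)) > 1
        total = ref_depth + alt_depth
        if is_het and total > 0:
            balances.append(alt_depth / total)
    return sum(balances) / len(balances) if balances else None


def is_skewed(aab: Optional[float], low: float = 0.3, high: float = 0.7) -> bool:
    """Flag a variant whose AAB falls outside a window around 0.5.

    The window bounds are hypothetical defaults, not values from the paper.
    """
    return aab is not None and not (low <= aab <= high)


def switch_error_rate(truth: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of adjacent heterozygous site pairs with a phase switch.

    Both inputs give the allele (0 or 1) carried by the first haplotype at
    each heterozygous site, in genomic order.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    if len(truth) < 2:
        return 0.0
    agree = [t == p for t, p in zip(truth, predicted)]
    switches = sum(1 for a, b in zip(agree, agree[1:]) if a != b)
    return switches / (len(truth) - 1)


if __name__ == "__main__":
    balanced = [("0/1", 10, 10), ("0/1", 15, 5), ("0/0", 20, 0)]
    skewed = [("0/1", 18, 2), ("0/1", 27, 3)]
    print(average_allele_balance(balanced), is_skewed(average_allele_balance(balanced)))
    print(average_allele_balance(skewed), is_skewed(average_allele_balance(skewed)))
    print(switch_error_rate([0, 1, 0, 1, 1], [0, 1, 1, 0, 0]))
```

A variant whose heterozygous calls carry far fewer alternate reads than reference reads, as in the `skewed` example, is the kind of call that the AAB measure would flag. As the abstract notes, such a flag alone is not sufficient to separate correct from erroneous variants.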

CONCLUSIONS

The results indicate that erroneous variants caused by subgenome similarity can be identified effectively without extensive genotyping of mapping populations. The LD-based filtering method improved phasing accuracy, which in turn improves the traceability of important alleles in the germplasm, paving the way for a better understanding of trait associations in F. x ananassa.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f6eb/11606298/6b3b6ed08c90/12864_2024_10987_Fig1_HTML.jpg
