
SVLearn: a dual-reference machine learning approach enables accurate cross-species genotyping of structural variants.

Authors

Yang Qimeng, Sun Jianfeng, Wang Xinyu, Wang Jiong, Liu Quanzhong, Ru Jinlong, Zhang Xin, Wang Sizhe, Hao Ran, Bian Peipei, Dai Xuelei, Gong Mian, Zhang Zhuangbiao, Wang Ao, Bai Fengting, Li Ran, Cai Yudong, Jiang Yu

Affiliations

Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China.

Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK.

Publication information

Nat Commun. 2025 Mar 11;16(1):2406. doi: 10.1038/s41467-025-57756-z.

DOI: 10.1038/s41467-025-57756-z
PMID: 40069188
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11897243/
Abstract

Structural variations (SVs) are diverse forms of genetic alteration and drive a wide range of human diseases. Accurately genotyping SVs from short-read sequencing data, particularly those occurring in repetitive genomic regions, remains challenging. Here, we introduce SVLearn, a machine-learning approach for genotyping bi-allelic SVs. It exploits a dual-reference strategy to engineer a curated set of genomic, alignment, and genotyping features based on a reference genome in concert with an allele-based alternative genome. Using 38,613 human-derived SVs, we show that SVLearn significantly outperforms four state-of-the-art tools, with precision improvements of up to 15.61% for insertions and 13.75% for deletions in repetitive regions. On two additional sets of 121,435 cattle SVs and 113,042 sheep SVs, SVLearn generalizes well to cross-species SV genotyping, with a weighted genotype concordance score of up to 90%. Notably, SVLearn enables accurate genotyping of SVs at low sequencing coverage, with accuracy comparable to that at 30× coverage. Our studies suggest that SVLearn can accelerate the understanding of associations between genome-scale, high-quality genotyped SVs and diseases across multiple species.
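
The abstract reports a weighted genotype concordance score but does not define it on this page. The sketch below assumes one commonly used definition: the unweighted mean of per-class concordance across the truth genotype classes (0/0, 0/1, 1/1), so that a rare class counts as much as a common one. The function name, the genotype strings, and the toy data are all hypothetical and are not taken from SVLearn itself.

```python
from collections import defaultdict


def weighted_genotype_concordance(truth, called):
    """Return (wGC, per_class) under the ASSUMED definition:
    the mean of per-class concordance over the truth genotype classes present."""
    if len(truth) != len(called):
        raise ValueError("truth and called must have the same length")
    if not truth:
        raise ValueError("at least one genotype is required")

    totals = defaultdict(int)   # number of truth genotypes per class
    matches = defaultdict(int)  # number of correctly called genotypes per class
    for t, c in zip(truth, called):
        totals[t] += 1
        if t == c:
            matches[t] += 1

    per_class = {g: matches[g] / totals[g] for g in totals}
    wgc = sum(per_class.values()) / len(per_class)
    return wgc, per_class


# Toy data (made up): six 0/0, two 0/1, and two 1/1 truth genotypes.
truth = ["0/0"] * 6 + ["0/1"] * 2 + ["1/1"] * 2
called = ["0/0"] * 6 + ["0/1", "0/0"] + ["1/1", "0/1"]

wgc, per_class = weighted_genotype_concordance(truth, called)
accuracy = sum(t == c for t, c in zip(truth, called)) / len(truth)
print(f"wGC = {wgc:.4f}, plain accuracy = {accuracy:.4f}")
```

In this toy case the plain accuracy (8 of 10 correct) is higher than the class-averaged score, because every error falls in the two less frequent genotype classes. That gap is the reason a class-balanced metric is often preferred when most sites are homozygous reference.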


