StableLift: Optimized Germline and Somatic Variant Detection Across Genome Builds.

Authors

Wang Nicholas K, Wiltsie Nicholas, Winata Helena K, Fitz-Gibbon Sorel, Gonzalez Alfredo E, Zeltser Nicole, Agrawal Raag, Oh Jieun, Arbet Jaron, Patel Yash, Yamaguchi Takafumi N, Boutros Paul C

Affiliations

Department of Human Genetics, University of California, Los Angeles.

Jonsson Comprehensive Cancer Center, University of California, Los Angeles.

Publication information

bioRxiv. 2024 Nov 3:2024.10.31.621401. doi: 10.1101/2024.10.31.621401.

DOI: 10.1101/2024.10.31.621401
PMID: 39554127
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11565985/

Abstract

Reference genomes are foundational to modern genomics. Our growing understanding of genome structure leads to continual improvements in reference genomes and new genome "builds" with incompatible coordinate systems. We quantified the impact of genome build on germline and somatic variant calling by analyzing tumour-normal whole-genome pairs against the two most widely used human genome builds. The average individual had a build-discordance of 3.8% for germline SNPs, 8.6% for germline SVs, 25.9% for somatic SNVs and 49.6% for somatic SVs. Build-discordant variants are not simply false-positives: 47% were verified by targeted resequencing. Build-discordant variants were associated with specific genomic and technical features in variant- and algorithm-specific patterns. We leveraged these patterns to create StableLift, an algorithm that predicts cross-build stability with AUROCs of 0.934 ± 0.029. These results call for significant caution in cross-build analyses and for use of StableLift as a computationally efficient solution to mitigate inter-build artifacts.
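
As a rough illustration of the build-discordance figures quoted above, the sketch below compares two call sets for one sample after both have been placed on the same coordinate system. It rests on assumptions not stated in the abstract: that a variant is identified by (chromosome, position, ref, alt), that discordance is measured as the share of the union of calls not found in both sets, and that liftover has already been done upstream. The function name, variable names, and toy variants are hypothetical, and the authors' actual definition may differ.

```python
from typing import Set, Tuple

# Hypothetical variant key: (chromosome, position, ref allele, alt allele).
# Both call sets are assumed to be expressed in the same build already.
Variant = Tuple[str, int, str, str]


def discordance_rate(calls_build_a: Set[Variant],
                     calls_build_b_lifted: Set[Variant]) -> float:
    """Fraction of variants in the union that are absent from one of the sets.

    This is an assumed working definition, not the one used in the paper.
    """
    union = calls_build_a | calls_build_b_lifted
    if not union:
        return 0.0  # nothing called in either build
    shared = calls_build_a & calls_build_b_lifted
    return 1.0 - len(shared) / len(union)


# Toy data: two calls agree, one is unique to each build.
calls_build_a = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "A"),
}
calls_build_b_lifted = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 75, "T", "C"),
}

rate = discordance_rate(calls_build_a, calls_build_b_lifted)
print(f"build discordance: {rate:.2f}")  # 2 shared of 4 in the union -> 0.50
```

A union-based denominator treats both builds symmetrically; a denominator based on only one build's calls would give a different rate for the same data.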

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a531/11565985/339c0fb113c9/nihpp-2024.10.31.621401v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a531/11565985/03e2a3b46851/nihpp-2024.10.31.621401v1-f0001.jpg
