A framework to score the effects of structural variants in health and disease.

Affiliations

Berlin Institute of Health (BIH) at Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany.

Institute of Human Genetics, University Medical Center Schleswig-Holstein, University of Lübeck, 23562 Lübeck, Germany.

Publication information

Genome Res. 2022 Apr;32(4):766-777. doi: 10.1101/gr.275995.121. Epub 2022 Feb 23.

DOI: 10.1101/gr.275995.121
PMID: 35197310
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8997355/
Abstract

Although technological advances improved the identification of structural variants (SVs) in the human genome, their interpretation remains challenging. Several methods utilize individual mechanistic principles like the deletion of coding sequence or 3D genome architecture disruptions. However, a comprehensive tool using the broad spectrum of available annotations is missing. Here, we describe CADD-SV, a method to retrieve and integrate a wide set of annotations to predict the effects of SVs. Previously, supervised learning approaches were limited due to a small number and biased set of annotated pathogenic or benign SVs. We overcome this problem by using a surrogate training objective, the Combined Annotation Dependent Depletion (CADD) of functional variants. We use human- and chimpanzee-derived SVs as proxy-neutral and contrast them with matched simulated variants as proxy-deleterious, an approach that has proven powerful for short sequence variants. Our tool computes summary statistics over diverse variant annotations and uses random forest models to prioritize deleterious structural variants. The resulting CADD-SV scores correlate with known pathogenic and rare population variants. We further show that we can prioritize somatic cancer variants as well as noncoding variants known to affect gene expression. We provide a website and offline-scoring tool for easy application of CADD-SV.
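The training setup described in the abstract (observed SVs as proxy-neutral, matched simulated variants as proxy-deleterious, summary statistics over annotations as features) can be illustrated with a minimal sketch. This is not the CADD-SV code: the function names `simulate_matched` and `summarize`, the toy annotation track, and the choice to match only on chromosome and length are assumptions made for illustration, since the abstract says only that simulated variants are "matched".

```python
import random
from statistics import mean


def simulate_matched(sv, chrom_length, rng):
    """Draw a random interval with the same chromosome and length as `sv`.

    Assumption: matching on length only; the real method may match on more.
    """
    chrom, start, end = sv
    length = end - start
    new_start = rng.randrange(0, chrom_length - length + 1)
    return (chrom, new_start, new_start + length)


def summarize(track, start, end):
    """Summary statistics of a per-base annotation track over [start, end)."""
    values = track[start:end]
    return {"max": max(values), "mean": mean(values)}


# Toy per-base annotation (hypothetical): 1.0 inside a "conserved" block.
chrom_length = 1000
track = [0.0] * chrom_length
for i in range(400, 500):
    track[i] = 1.0

rng = random.Random(0)
observed = ("chr1", 100, 150)                              # proxy-neutral
simulated = simulate_matched(observed, chrom_length, rng)  # proxy-deleterious

# Feature dictionaries for each class; label 0 = proxy-neutral, 1 = proxy-deleterious.
training_rows = [
    (summarize(track, observed[1], observed[2]), 0),
    (summarize(track, simulated[1], simulated[2]), 1),
]
```

In a fuller pipeline, rows like `training_rows` would be built for many variants and annotations and passed to a classifier; the abstract names random forest models for that step.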

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b8b/8997355/23f0dec54237/766f01.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b8b/8997355/46c240750852/766f02.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b8b/8997355/9aebc3983766/766f03.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b8b/8997355/f1ed287664a5/766f04.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b8b/8997355/bc8586d7ce80/766f05.jpg
