Accurate in silico confirmation of rare copy number variant calls from exome sequencing data using transfer learning.

Affiliations

Department of Systems Biology, Columbia University, New York, NY 10032, USA.

Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA.

Publication information

Nucleic Acids Res. 2022 Nov 28;50(21):e123. doi: 10.1093/nar/gkac788.

DOI: 10.1093/nar/gkac788
PMID: 36124672
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9756945/
Abstract

Exome sequencing is widely used in genetic studies of human diseases and clinical genetic diagnosis. Accurate detection of copy number variants (CNVs) is important to fully utilize exome sequencing data. However, exome data are noisy. None of the existing methods alone can achieve both high precision and recall rate. A common practice is to perform heuristic filtration followed by manual inspection of read depth of putative CNVs. This approach does not scale in large studies. To address this issue, we developed a transfer learning method, CNV-espresso, for in silico confirming rare CNVs from exome sequencing data. CNV-espresso encodes candidate CNVs from exome data as images and uses pretrained convolutional neural network models to classify copy number states. We trained CNV-espresso using an offspring-parents trio exome sequencing dataset, with inherited CNVs as positives and CNVs with Mendelian errors as negatives. We evaluated the performance using additional samples that have both exome and whole-genome sequencing (WGS) data. Assuming the CNVs detected from WGS data as a proxy of ground truth, CNV-espresso significantly improves precision while keeping recall almost intact, especially for CNVs that span a small number of exons. CNV-espresso can effectively replace manual inspection of CNVs in large-scale exome sequencing studies.
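The evaluation described in the abstract, where CNVs detected from WGS data serve as a proxy for ground truth, can be illustrated with a small sketch. The abstract does not say how an exome call is matched to a WGS call, so the 50% reciprocal-overlap rule, the `CNV` record layout, and the example coordinates below are all assumptions chosen for illustration; this is not the CNV-espresso implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    """Hypothetical CNV record; field names are illustrative only."""
    sample: str
    chrom: str
    start: int  # inclusive coordinate (assumed convention)
    end: int    # inclusive coordinate
    kind: str   # "DEL" or "DUP"


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Return the smaller of (overlap / len(a)) and (overlap / len(b))."""
    if (a.sample, a.chrom, a.kind) != (b.sample, b.chrom, b.kind):
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start) + 1
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start + 1), overlap / (b.end - b.start + 1))


def precision_recall(calls, truth, min_overlap=0.5):
    """Precision and recall of `calls` against `truth` (WGS proxy).

    A call counts as correct if it reaches `min_overlap` reciprocal overlap
    with any truth CNV; a truth CNV counts as recovered if any call does.
    The 0.5 default is an assumed threshold, not taken from the paper.
    """
    correct_calls = sum(
        any(reciprocal_overlap(c, t) >= min_overlap for t in truth) for c in calls
    )
    recovered = sum(
        any(reciprocal_overlap(c, t) >= min_overlap for c in calls) for t in truth
    )
    precision = correct_calls / len(calls) if calls else 0.0
    recall = recovered / len(truth) if truth else 0.0
    return precision, recall


# Made-up example: two WGS CNVs and four candidate exome calls for one sample.
wgs_truth = [
    CNV("s1", "chr1", 1000, 2000, "DEL"),
    CNV("s1", "chr2", 5000, 9000, "DUP"),
]
exome_candidates = [
    CNV("s1", "chr1", 1100, 2000, "DEL"),
    CNV("s1", "chr2", 5000, 8800, "DUP"),
    CNV("s1", "chr3", 100, 300, "DEL"),
    CNV("s1", "chr5", 700, 900, "DUP"),
]
# Pretend a confirmation step rejected one of the two unsupported candidates.
confirmed = [exome_candidates[0], exome_candidates[1], exome_candidates[3]]

raw_precision, raw_recall = precision_recall(exome_candidates, wgs_truth)
conf_precision, conf_recall = precision_recall(confirmed, wgs_truth)
print(f"raw:       precision={raw_precision:.2f} recall={raw_recall:.2f}")
print(f"confirmed: precision={conf_precision:.2f} recall={conf_recall:.2f}")
```

In this toy data, removing an unsupported candidate raises precision while recall is unchanged, which is the kind of trade-off the abstract reports for confirmation of candidate calls.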

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f7fe/9756945/16c0dbdb77a6/gkac788fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f7fe/9756945/d3153384725b/gkac788fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f7fe/9756945/a710d9815401/gkac788fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f7fe/9756945/3642260cd525/gkac788fig3.jpg
