CODEX: a normalization and copy number variation detection method for whole exome sequencing.

Authors

Jiang Yuchao, Oldridge Derek A, Diskin Sharon J, Zhang Nancy R

Affiliations

Genomics and Computational Biology Graduate Program, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.

Medical Scientist Training Program, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.

Division of Oncology and Center for Childhood Cancer Research, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.

Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.

Publication information

Nucleic Acids Res. 2015 Mar 31;43(6):e39. doi: 10.1093/nar/gku1363. Epub 2015 Jan 23.

DOI:10.1093/nar/gku1363
PMID:25618849
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4381046/

Abstract

High-throughput sequencing of DNA coding regions has become a common way of assaying genomic variation in the study of human diseases. Copy number variation (CNV) is an important type of genomic variation, but detecting and characterizing CNV from exome sequencing is challenging due to the high level of biases and artifacts. We propose CODEX, a normalization and CNV calling procedure for whole exome sequencing data. The Poisson latent factor model in CODEX includes terms that specifically remove biases due to GC content, exon capture and amplification efficiency, and latent systemic artifacts. CODEX also includes a Poisson likelihood-based recursive segmentation procedure that explicitly models the count-based exome sequencing data. CODEX is compared to existing methods on a population analysis of HapMap samples from the 1000 Genomes Project, and shown to be more accurate on three microarray-based validation data sets. We further evaluate performance on 222 neuroblastoma samples with matched normals and focus on a well-studied rare somatic CNV within the ATRX gene. We show that the cross-sample normalization procedure of CODEX removes more noise than normalizing the tumor against the matched normal and that the segmentation procedure performs well in detecting CNVs with nested structures.
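The Poisson likelihood-based segmentation idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the CODEX implementation: it assumes that normalization has already produced an expected read count per exon, and it runs a single exhaustive scan for the one window that best supports a copy number change rather than a recursive procedure. The function names `poisson_llr` and `best_segment` and the toy counts are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def poisson_llr(observed: Sequence[float], expected: Sequence[float]) -> float:
    """Generalized log-likelihood ratio for one candidate segment.

    Null: each count Y_i ~ Poisson(lambda_i).
    Alternative: Y_i ~ Poisson(mu * lambda_i) with one shared ratio mu,
    estimated by its MLE mu_hat = sum(Y) / sum(lambda).
    """
    y = float(sum(observed))
    lam = float(sum(expected))
    if lam <= 0:
        raise ValueError("expected counts must sum to a positive value")
    if y == 0:
        return lam  # limit of y*log(y/lam) - (y - lam) as y -> 0
    return y * math.log(y / lam) - (y - lam)


def best_segment(
    observed: Sequence[float],
    expected: Sequence[float],
    max_len: Optional[int] = None,
) -> Tuple[int, int, float]:
    """Scan all windows and return (start, end_exclusive, llr) of the best one.

    Illustrative O(n * max_len) scan; a recursive scheme would repeat it
    on the flanking regions and inside the detected segment.
    """
    n = len(observed)
    if n != len(expected):
        raise ValueError("observed and expected must have the same length")
    if max_len is None:
        max_len = n
    best = (0, 0, 0.0)
    for start in range(n):
        stop = min(n, start + max_len)
        for end in range(start + 1, stop + 1):
            llr = poisson_llr(observed[start:end], expected[start:end])
            if llr > best[2]:
                best = (start, end, llr)
    return best


# Toy data (made up): exons 3-5 have about half the expected coverage,
# as a heterozygous deletion would produce.
observed_counts = [100, 98, 103, 51, 49, 50, 101, 99]
expected_counts = [100.0] * 8
start, end, llr = best_segment(observed_counts, expected_counts)
ratio = sum(observed_counts[start:end]) / sum(expected_counts[start:end])
print(start, end, round(llr, 2), ratio)
```

Working on raw counts with a Poisson likelihood, instead of log-ratios with a Gaussian one, keeps low-coverage exons from being weighted the same as high-coverage exons, which is the motivation the abstract gives for modeling count-based data explicitly.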

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c6de/4381046/5b4a1c24fff3/gku1363fig1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c6de/4381046/9cf0ed2df788/gku1363fig2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c6de/4381046/645f2c822510/gku1363fig3.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c6de/4381046/c46df5fb3bb4/gku1363fig4.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c6de/4381046/ad16ea888f7d/gku1363fig5.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c6de/4381046/e54f636b2ef6/gku1363fig6.jpg
Figure 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c6de/4381046/f4093e043a72/gku1363fig7.jpg
