Modeling read counts for CNV detection in exome sequencing data.

Author information

Love Michael I, Myšičková Alena, Sun Ruping, Kalscheuer Vera, Vingron Martin, Haas Stefan A

Affiliation

Max Planck Institute for Molecular Genetics.

Publication information

Stat Appl Genet Mol Biol. 2011 Nov 8;10(1):/j/sagmb.2011.10.issue-1/1544-6115.1732/1544-6115.1732.xml. doi: 10.2202/1544-6115.1732.

DOI:10.2202/1544-6115.1732

PMID:23089826

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3517018/

Abstract

Varying depth of high-throughput sequencing reads along a chromosome makes it possible to observe copy number variants (CNVs) in a sample relative to a reference. In exome and other targeted sequencing projects, technical factors increase variation in read depth while reducing the number of observed locations, adding difficulty to the problem of identifying CNVs. We present a hidden Markov model for detecting CNVs from raw read count data, using background read depth from a control set as well as other positional covariates such as GC-content. The model, exomeCopy, is applied to a large chromosome X exome sequencing project identifying a list of large unique CNVs. CNVs predicted by the model and experimentally validated are then recovered using a cross-platform control set from publicly available exome sequencing data. Simulations show high sensitivity for detecting heterozygous and homozygous CNVs, outperforming normalization and state-of-the-art segmentation methods.
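
The hidden Markov model idea in the abstract can be illustrated with a minimal sketch. This is not the exomeCopy implementation: it assumes Poisson emissions whose mean is the control-set background depth scaled by copy number, a single "stay" probability for transitions, and no GC-content or other covariates. The function names, the state set, and the parameters `stay_prob` and `eps` are all hypothetical choices made for illustration.

```python
import math

def poisson_logpmf(k, lam):
    # log P(K = k) for a Poisson distribution with mean lam (lam > 0).
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def viterbi_copy_number(counts, background, states=(0, 1, 2, 3, 4),
                        normal=2, stay_prob=0.95, eps=0.01):
    """Most likely copy-number path for per-window read counts.

    counts[i]     : reads observed in window i of the sample
    background[i] : reads expected in window i at the normal copy number
                    (assumed > 0, e.g. estimated from a control set)
    """
    n = len(states)
    switch = (1.0 - stay_prob) / (n - 1)
    log_trans = [[math.log(stay_prob if a == b else switch)
                  for b in range(n)] for a in range(n)]

    def log_emit(s, i):
        # Assumption: expected depth scales linearly with copy number;
        # eps keeps the mean positive for the zero-copy state.
        mean = background[i] * max(states[s] / normal, eps)
        return poisson_logpmf(counts[i], mean)

    # Initialisation with a uniform prior over states.
    score = [math.log(1.0 / n) + log_emit(s, 0) for s in range(n)]
    back = []
    for i in range(1, len(counts)):
        new_score, pointers = [], []
        for s in range(n):
            prev = max(range(n), key=lambda p: score[p] + log_trans[p][s])
            pointers.append(prev)
            new_score.append(score[prev] + log_trans[prev][s] + log_emit(s, i))
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    s = max(range(n), key=lambda p: score[p])
    path = [s]
    for pointers in reversed(back):
        s = pointers[s]
        path.append(s)
    path.reverse()
    return [states[s] for s in path]

if __name__ == "__main__":
    background = [100] * 10
    counts = [100, 98, 103, 51, 49, 52, 50, 101, 99, 100]
    print(viterbi_copy_number(counts, background))
```

In this toy input, windows 4 to 7 have roughly half the background depth, so the decoded path should mark them as a single-copy segment flanked by the normal state. Working on raw counts rather than normalized ratios is the point the abstract emphasizes; a fuller model would also have to account for the extra variance and positional covariates that the abstract mentions.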

Similar articles

Exome sequence read depth methods for identifying copy number changes.

Brief Bioinform. 2015 May;16(3):380-92. doi: 10.1093/bib/bbu027. Epub 2014 Aug 28.

CNVind: an open source cloud-based pipeline for rare CNVs detection in whole exome sequencing data based on the depth of coverage.

BMC Bioinformatics. 2022 Mar 5;23(1):85. doi: 10.1186/s12859-022-04617-x.

Assessing the reproducibility of exome copy number variations predictions.

Genome Med. 2016 Aug 8;8(1):82. doi: 10.1186/s13073-016-0336-6.

Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth.

Am J Hum Genet. 2012 Oct 5;91(4):597-607. doi: 10.1016/j.ajhg.2012.08.005.

CODEX: a normalization and copy number variation detection method for whole exome sequencing.

Nucleic Acids Res. 2015 Mar 31;43(6):e39. doi: 10.1093/nar/gku1363. Epub 2015 Jan 23.

Evaluation of somatic copy number estimation tools for whole-exome sequencing data.

Brief Bioinform. 2016 Mar;17(2):185-92. doi: 10.1093/bib/bbv055. Epub 2015 Jul 25.

Detecting copy-number variations in whole-exome sequencing data using the eXome Hidden Markov Model: an 'exome-first' approach.

J Hum Genet. 2015 Apr;60(4):175-82. doi: 10.1038/jhg.2014.124. Epub 2015 Jan 22.

Allele-specific copy-number discovery from whole-genome and whole-exome sequencing.

Nucleic Acids Res. 2015 Aug 18;43(14):e90. doi: 10.1093/nar/gkv319. Epub 2015 Apr 16.

Comparison of kNN and k-means optimization methods of reference set selection for improved CNV callers performance.

BMC Bioinformatics. 2019 May 28;20(1):266. doi: 10.1186/s12859-019-2889-z.

Cited by

Unravelling the Genetic Landscape of Hemiplegic Migraine: Exploring Innovative Strategies and Emerging Approaches.

Genes (Basel). 2024 Mar 31;15(4):443. doi: 10.3390/genes15040443.

Genetic interrogation for sequence and copy number variants in systemic lupus erythematosus.

Front Genet. 2024 Mar 4;15:1341272. doi: 10.3389/fgene.2024.1341272. eCollection 2024.

PKHD1L1, a gene involved in the stereocilia coat, causes autosomal recessive nonsyndromic hearing loss.

Hum Genet. 2024 Mar;143(3):311-329. doi: 10.1007/s00439-024-02649-2. Epub 2024 Mar 9.

PKHD1L1, A Gene Involved in the Stereocilia Coat, Causes Autosomal Recessive Nonsyndromic Hearing Loss.

medRxiv. 2023 Dec 19:2023.10.08.23296081. doi: 10.1101/2023.10.08.23296081.

Bone marrow endosteal stem cells dictate active osteogenesis and aggressive tumorigenesis.

Nat Commun. 2023 Apr 25;14(1):2383. doi: 10.1038/s41467-023-38034-2.

Different Strategies for Counting the Depth of Coverage in Copy Number Variation Calling Tools.

Bioinform Biol Insights. 2022 Aug 3;16:11779322221115534. doi: 10.1177/11779322221115534. eCollection 2022.

Algorithmic improvements for discovery of germline copy number variants in next-generation sequencing data.

BMC Bioinformatics. 2022 Jul 19;23(1):285. doi: 10.1186/s12859-022-04820-w.

Whole-exome sequencing of Indian prostate cancer reveals a novel therapeutic target: POLQ.

J Cancer Res Clin Oncol. 2023 Jun;149(6):2451-2462. doi: 10.1007/s00432-022-04111-0. Epub 2022 Jun 23.

Progress in Methods for Copy Number Variation Profiling.

Int J Mol Sci. 2022 Feb 15;23(4):2143. doi: 10.3390/ijms23042143.

Benchmarking germline CNV calling tools from exome sequencing data.

Sci Rep. 2021 Jul 13;11(1):14416. doi: 10.1038/s41598-021-93878-2.

References

Exome sequencing-based copy-number variation and loss of heterozygosity detection: ExomeCNV.

Bioinformatics. 2011 Oct 1;27(19):2648-54. doi: 10.1093/bioinformatics/btr462. Epub 2011 Aug 9.

Exome sequencing in sporadic autism spectrum disorders identifies severe de novo mutations.

Nat Genet. 2011 Jun;43(6):585-9. doi: 10.1038/ng.835. Epub 2011 May 15.

Comparison of three targeted enrichment strategies on the SOLiD sequencing platform.

PLoS One. 2011 Apr 29;6(4):e18595. doi: 10.1371/journal.pone.0018595.

Accurate and exact CNV identification from targeted high-throughput sequence data.

BMC Genomics. 2011 Apr 12;12:184. doi: 10.1186/1471-2164-12-184.

ReadDepth: a parallel R package for detecting copy number alterations from short sequencing reads.

PLoS One. 2011 Jan 31;6(1):e16327. doi: 10.1371/journal.pone.0016327.

Control-free calling of copy number alterations in deep-sequencing data using GC-content normalization.

Bioinformatics. 2011 Jan 15;27(2):268-9. doi: 10.1093/bioinformatics/btq635. Epub 2010 Nov 15.

A map of human genome variation from population-scale sequencing.

Nature. 2010 Oct 28;467(7319):1061-73. doi: 10.1038/nature09534.

Differential expression analysis for sequence count data.

Genome Biol. 2010;11(10):R106. doi: 10.1186/gb-2010-11-10-r106. Epub 2010 Oct 27.

CNAseg--a novel framework for identification of copy number changes in cancer from second-generation sequencing data.

Bioinformatics. 2010 Dec 15;26(24):3051-8. doi: 10.1093/bioinformatics/btq587. Epub 2010 Oct 21.

Resequencing of 200 human exomes identifies an excess of low-frequency non-synonymous coding variants.

Nat Genet. 2010 Nov;42(11):969-72. doi: 10.1038/ng.680. Epub 2010 Oct 3.
