A Deep Learning Approach for Detecting Copy Number Variation in Next-Generation Sequencing Data.

Affiliation

4055 Haworth Hall, The Department of Molecular Biosciences, University of Kansas, 1200 Sunnyside Avenue, Lawrence, KS 66045.

Publication information

G3 (Bethesda). 2019 Nov 5;9(11):3575-3582. doi: 10.1534/g3.119.400596.

Abstract

Copy number variants (CNVs) are associated with phenotypic variation in several species. However, properly detecting changes in the copy number of sequences remains a difficult problem, especially in lower-quality or lower-coverage next-generation sequencing data. Here, inspired by recent applications of machine learning in genomics, we describe a method to detect duplications and deletions in short-read sequencing data. In low-coverage data, machine learning appears to be more powerful in detecting CNVs than the gold-standard methods of coverage estimation alone, and of equal power in high-coverage data. We also demonstrate how replicating training sets allows more precise detection of CNVs, even identifying novel CNVs in two genomes previously surveyed thoroughly for CNVs using long-read data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa5/6829143/d471a983c437/3575f1.jpg
