CNV-CH: A Convex Hull Based Segmentation Approach to Detect Copy Number Variations (CNV) Using Next-Generation Sequencing Data.

Author information

Sinha Rituparna, Samaddar Sandip, De Rajat K

Affiliations

Department of Information Technology, Heritage Institute Of Technology, Kolkata, West Bengal, India.

Department of Computer Science and Engineering, Heritage Institute Of Technology, Kolkata, West Bengal, India.

Publication information

PLoS One. 2015 Aug 20;10(8):e0135895. doi: 10.1371/journal.pone.0135895. eCollection 2015.

Abstract

Copy number variation (CNV) is a form of structural alteration in the mammalian DNA sequence that is associated with many complex neurological diseases as well as cancer. The development of next-generation sequencing (NGS) technology offers a new way to detect genomic locations with copy number variations. Here we develop an algorithm for detecting CNVs based on depth-of-coverage data generated by NGS technology. In this work, we use a novel way to represent the read count data as two-dimensional geometrical points. A key aspect of detecting regions with CNVs is to devise a proper segmentation algorithm that distinguishes genomic locations having a significant difference in read count data. We have designed a new segmentation approach in this context, applying a convex hull algorithm to the geometrical representation of read count data. To our knowledge, most algorithms use a single distribution model of read count data; in our approach, we consider the read count data to follow two different distribution models independently, which adds to the robustness of CNV detection. In addition, our algorithm calls CNVs using a multiple-sample analysis approach, resulting in a low false discovery rate with high precision.
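
The abstract does not say how read counts are mapped to two-dimensional points or how the hull drives segmentation, so the following is only a minimal sketch of the generic building block it names, a convex hull over 2D points. It assumes, purely for illustration, that each genomic bin becomes the point (bin index, read count); the helper names `read_counts_to_points` and `convex_hull` are hypothetical and are not taken from CNV-CH. The hull is computed with Andrew's monotone chain, a standard method that may differ from the one the authors used.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop points that would make a right turn or lie on the same line.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def read_counts_to_points(counts: Sequence[int]) -> List[Point]:
    """ASSUMPTION: one point per bin, (bin index, read count)."""
    return [(float(i), float(c)) for i, c in enumerate(counts)]


# Toy read counts (made up for illustration): bins 4-6 look amplified.
counts = [10, 11, 9, 10, 30, 31, 29, 10, 11]
hull = convex_hull(read_counts_to_points(counts))
```

In this toy input the high-count bins end up as hull vertices while a baseline bin such as index 3 falls inside the hull, which hints at why a geometric view can separate regions with markedly different read counts. Any real segmentation rule built on top of the hull would have to follow the paper itself.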

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/27a8/4546278/8dbb4530bfb6/pone.0135895.g001.jpg
