Efficient change point detection for genomic sequences of continuous measurements.

Affiliation

Dipartimento di Scienze Statistiche e Matematiche Vianelli, Università di Palermo, Palermo, Italy.

Publication information

Bioinformatics. 2011 Jan 15;27(2):161-6. doi: 10.1093/bioinformatics/btq647. Epub 2010 Nov 18.

Abstract

MOTIVATION

Knowing the exact locations of multiple change points in genomic sequences serves several biological needs, for instance when the data represent aCGH profiles and the aim is to identify possibly damaged genes involved in cancer and other diseases. Only a few of the currently available methods deal explicitly with estimating the number and location of change points, and these methods may be somewhat vulnerable to departures from the model assumptions they usually rely on.

RESULTS

We present a computationally efficient method to estimate the number and location of the change points. The method is based on a simple transformation of the data and yields results that are quite robust to model misspecification. The efficiency of the method guarantees moderate computational times regardless of the series length and the number of change points.
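The abstract does not say which transformation is used. As a rough illustration only, the sketch below assumes a cumulative-sum transform (a guess suggested by the package name cumSeg): cumulating a series whose mean is piecewise constant gives a trend that is piecewise linear, so a shift in level shows up as a change in slope. The function name `single_change_point` and the exhaustive search over candidate positions are hypothetical choices made for this example; it handles one change point only and is not the authors' algorithm, which also estimates the number of change points.

```python
def single_change_point(y):
    """Estimate one change in mean by fitting a broken line to cumulative sums.

    Returns k, the number of observations in the first segment.
    Illustrative sketch only; assumes exactly one change point.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")

    # Cumulative sums: a piecewise-constant mean becomes a piecewise-linear trend.
    z, total = [], 0.0
    for value in y:
        total += value
        z.append(total)

    best_k, best_rss = None, float("inf")
    for k in range(1, n):
        # Model through the origin: z_i = b * i + d * max(i - k, 0)
        sxx = sxu = suu = sxz = suz = 0.0
        for i in range(1, n + 1):
            x = float(i)
            u = max(x - k, 0.0)
            zi = z[i - 1]
            sxx += x * x
            sxu += x * u
            suu += u * u
            sxz += x * zi
            suz += u * zi

        # Solve the 2x2 normal equations for slope b and slope change d.
        det = sxx * suu - sxu * sxu
        b = (sxz * suu - suz * sxu) / det
        d = (sxx * suz - sxu * sxz) / det

        # Residual sum of squares for this candidate position.
        rss = 0.0
        for i in range(1, n + 1):
            fitted = b * i + d * max(i - k, 0.0)
            rss += (z[i - 1] - fitted) ** 2

        if rss < best_rss:
            best_k, best_rss = k, rss

    return best_k


# Noise-free series: 50 values at level 0 followed by 50 values at level 2.
print(single_change_point([0.0] * 50 + [2.0] * 50))  # → 50
```

The exhaustive search costs quadratic time in the series length, so it does not reflect the efficiency claimed for the published method; it is meant only to show why a cumulative transform turns level shifts into slope changes that a broken-line fit can locate.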

AVAILABILITY

The methods described in this article are implemented in the new R package cumSeg available from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=cumSeg.
