Xing Baifang, Greenwood Celia M T, Bull Shelley B
Genetics and Genome Biology, Hospital for Sick Children, Toronto, Ontario, Canada.
Biostatistics. 2007 Jul;8(3):632-53. doi: 10.1093/biostatistics/kxl035. Epub 2006 Oct 23.
Microarray technologies allow simultaneous measurement of DNA copy number at thousands of positions in a genome. Gains and losses of DNA sequences reveal themselves through characteristic patterns of hybridization intensity. To identify change points along the chromosomes, we develop a marker clustering method that consists of two parts. First, a "circular clustering tree test statistic" attaches to each marker a statistic that measures the likelihood that the marker is a change point. Second, outlier detection approaches are applied to the resulting marker statistics. The method provides a new way to build a binary tree that can accurately capture change-point signals and is easy to perform. A simulation study shows good performance in change-point detection, and cancer cell line data, in which the regions of true copy number change are known, are used to illustrate performance.
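The two-part structure described in the abstract (a per-marker statistic, then outlier detection on those statistics) can be illustrated with a minimal sketch. The abstract does not define the circular clustering tree test statistic, so the sketch below does not implement it. As a stand-in, it scores each marker by the absolute difference between the mean intensities in adjacent windows, and it flags outliers with a robust z-score based on the median and the median absolute deviation. The function names, the window size, the threshold, and the simulated log-ratio values are all assumptions made for illustration.

```python
import random
import statistics


def marker_statistics(values, window=5):
    """Stand-in marker statistic (NOT the paper's circular clustering tree
    statistic): absolute difference between the mean of the `window` values
    before marker i and the mean of the `window` values starting at i."""
    scores = [0.0] * len(values)
    for i in range(window, len(values) - window + 1):
        left = statistics.fmean(values[i - window:i])
        right = statistics.fmean(values[i:i + window])
        scores[i] = abs(right - left)
    return scores


def flag_outliers(scores, threshold=8.0):
    """Return indices whose score is unusually large, judged by a robust
    z-score built from the median and the median absolute deviation (MAD)."""
    med = statistics.median(scores)
    mad = statistics.median([abs(s - med) for s in scores])
    if mad == 0:
        return []  # no spread in the scores, so nothing can be called an outlier
    return [i for i, s in enumerate(scores)
            if 0.6745 * (s - med) / mad > threshold]


# Simulated log ratios (hypothetical data): a gain between markers 40 and 59.
rng = random.Random(0)
values = ([rng.gauss(0.0, 0.1) for _ in range(40)]
          + [rng.gauss(0.8, 0.1) for _ in range(20)]
          + [rng.gauss(0.0, 0.1) for _ in range(40)])

scores = marker_statistics(values, window=5)
flagged = flag_outliers(scores, threshold=8.0)
print(flagged)  # indices clustered around the two simulated change points
```

Because adjacent windows overlap, markers next to a true change point also receive elevated scores, so the flagged indices form short runs around markers 40 and 60 rather than two single positions. A practical pipeline would reduce each run to one representative marker.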