Yuan Ao, Chen Guanjie, Xiong Juan, He Wenqing, Rotimi Charles
National Human Genome Center, Howard University, Washington, D.C., USA.
J Appl Stat. 2011;38(5):987-1005. doi: 10.1080/02664761003692449.
Gene copy number (GCN) changes are a common characteristic of many genetic diseases. Comparative genomic hybridization (CGH) is a new technology widely used today to screen for GCN changes in mutant cells genome-wide and at high resolution. Statistical methods for analyzing such CGH data have been evolving. Existing methods are either frequentist or fully Bayesian. The former often have a computational advantage, while the latter can incorporate prior information into the model but can be misleading when sound prior information is unavailable. In an attempt to take full advantage of both approaches, we develop a Bayesian-frequentist hybrid approach, in which a subset of the model parameters is inferred by the Bayesian method and the remaining parameters by the frequentist method. This new hybrid approach provides advantages over the Bayesian or the frequentist method used alone, especially when sound prior information is available on part of the parameters and the sample size is relatively small. Spatial dependence and false discovery rate are also discussed, and the parameter estimation is efficient. As an illustration, we use the proposed hybrid approach to analyze a real CGH data set.