Zhu Hongxiao, Vannucci Marina, Cox Dennis D
Department of Biostatistics, University of Texas M. D. Anderson Cancer Center, Houston, Texas 77230, USA.
Biometrics. 2010 Jun;66(2):463-73. doi: 10.1111/j.1541-0420.2009.01283.x. Epub 2009 Jun 9.
In functional data classification, functional observations are often contaminated by various systematic effects, such as random batch effects caused by device artifacts, or fixed effects caused by sample-related factors. These effects may lead to classification bias and thus should not be neglected. Another issue of concern is the selection of functions when predictors consist of multiple functions, some of which may be redundant. These issues arise in a real data application where we use fluorescence spectroscopy to detect cervical precancer. In this article, we propose a Bayesian hierarchical model that takes into account random batch effects and selects effective functions among multiple functional predictors. Fixed effects or predictors in nonfunctional form are also included in the model. The dimension of the functional data is reduced through orthonormal basis expansion or functional principal components. For posterior sampling, we use a hybrid Metropolis-Hastings/Gibbs sampler, which suffers from slow mixing. An evolutionary Monte Carlo algorithm is applied to improve the mixing. Simulations and the real data application show that the proposed model provides accurate selection of functional predictors as well as good classification.
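As an illustration of the dimension-reduction step mentioned above, the following is a minimal sketch of functional principal components computed on discretized curves. It is not the authors' implementation: the function name `fpca_scores`, the synthetic curves, and the assumption that all curves are sampled on a common, equally spaced grid (so that quadrature weights can be ignored) are all hypothetical choices made for this example.

```python
import numpy as np


def fpca_scores(curves, n_components):
    """Project discretized curves onto their leading principal components.

    curves: array of shape (n_samples, n_grid); each row is one curve
    sampled on a common grid (an assumption of this sketch).
    Returns (scores, components, mean_curve).
    """
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    # Thin SVD: rows of vt are orthonormal directions, ordered by
    # decreasing singular value (variance explained).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Low-dimensional representation of each curve.
    scores = centered @ components.T
    return scores, components, mean_curve


# Synthetic data (hypothetical): two smooth modes of variation plus noise.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
a = rng.normal(size=(30, 1))
b = rng.normal(size=(30, 1))
curves = (
    a * np.sin(2 * np.pi * grid)
    + 0.5 * b * np.cos(2 * np.pi * grid)
    + 0.01 * rng.normal(size=(30, 50))
)

scores, components, mean_curve = fpca_scores(curves, n_components=2)
# Approximate each curve from its two scores.
reconstruction = mean_curve + scores @ components
```

In a model of the kind described, low-dimensional scores such as these, computed separately for each functional predictor, could stand in for the full curves; how the scores enter the hierarchical model and the selection step is not specified here.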