Li Yuliang, Bandyopadhyay Dipankar, Xie Fangzheng, Xu Yanxun
Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, Maryland, USA.
Department of Biostatistics, Virginia Commonwealth University, Richmond, Virginia, USA.
Stat Med. 2020 Jul 20;39(16):2139-2151. doi: 10.1002/sim.8536. Epub 2020 Apr 3.
Preventing periodontal diseases (PD) and maintaining the structure and function of teeth are important goals for personal oral care. To understand the heterogeneity in patients with diverse PD patterns, we develop a Bayesian repulsive biclustering method (BAREB) that simultaneously clusters PD patients and their tooth sites after taking patient- and site-level covariates into consideration. BAREB uses a determinantal point process as a prior to induce diversity among different biclusters, which facilitates parsimony and interpretability. Since PD progression is hypothesized to be spatially referenced, BAREB accounts for the spatial dependence among tooth sites. In addition, since PD is the leading cause of tooth loss, the missing data mechanism is nonignorable, and BAREB incorporates this nonrandom missingness. For posterior inference, we design an efficient reversible jump Markov chain Monte Carlo sampler. Simulation studies show that BAREB accurately estimates the biclusters and compares favorably with alternatives. As a real-world application, we apply BAREB to a dataset from a clinical PD study and obtain interpretable results. A major contribution of this article is the Rcpp implementation of our methodology, available in the R package BAREB.
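To illustrate why a determinantal point process prior discourages redundant clusters, the following minimal Python sketch compares the unnormalized log density, log det(K), of two sets of one-dimensional cluster centers. It is not the BAREB implementation: the Gaussian kernel, the `length_scale` parameter, the function names, and the example centers are all assumptions chosen for illustration, and the prior actually used in the article may be specified differently.

```python
import numpy as np


def gaussian_kernel(points, length_scale=1.0):
    """Similarity matrix between 1-D cluster centers (assumed kernel, for illustration)."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None] - pts[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def dpp_log_density_unnormalized(points, length_scale=1.0):
    """Return log det(K), the unnormalized log density of a DPP-style repulsive prior."""
    K = gaussian_kernel(points, length_scale)
    sign, logdet = np.linalg.slogdet(K)
    # A non-positive determinant means the centers are numerically indistinguishable.
    return logdet if sign > 0 else -np.inf


# Hypothetical cluster centers: one spread-out set, one nearly coincident set.
well_separated = [-3.0, 0.0, 3.0]
nearly_identical = [0.0, 0.1, 0.2]

print("well separated:  ", dpp_log_density_unnormalized(well_separated))
print("nearly identical:", dpp_log_density_unnormalized(nearly_identical))
```

Under these assumptions, the nearly coincident centers produce an almost singular kernel matrix and therefore a much lower log density than the well-separated centers. This is the sense in which such a prior is repulsive: it favors a small number of clearly distinct clusters.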