Xu Yanxun, Müller Peter, Telesca Donatello
Department of Statistics and Data Sciences, The University of Texas at Austin, Austin, Texas, U.S.A.
Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, Maryland, U.S.A.
Biometrics. 2016 Sep;72(3):955-64. doi: 10.1111/biom.12482. Epub 2016 Feb 12.
We discuss the use of the determinantal point process (DPP) as a prior for latent structure in biomedical applications, where inference often centers on interpreting latent features as biologically or clinically meaningful structure. Typical examples include mixture models in which the mixture components are meant to represent clinically meaningful subpopulations (of patients, genes, etc.). Feature allocation models are another class of examples. We propose the DPP prior as a repulsive prior on latent mixture components in the first case, and as a prior on feature-specific parameters in the second. We argue that the DPP is, in general, an attractive prior model for latent structure when a biologically relevant interpretation of such structure is desired. We illustrate the advantages of the DPP prior in three case studies: inference in mixture models for magnetic resonance images (MRI) and for protein expression, and a feature allocation model for gene expression using data from The Cancer Genome Atlas. An important part of our argument is the availability of efficient and straightforward posterior simulation methods. We implement a variation of reversible jump Markov chain Monte Carlo simulation for inference under the DPP prior, using a density with respect to the unit rate Poisson process.
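The repulsive behavior of a DPP prior can be illustrated with a small sketch. For a finite point configuration, a DPP density with respect to the unit rate Poisson process is proportional to the determinant of a kernel matrix evaluated at the points, so configurations with nearly coincident points receive very little prior mass. The code below is only an illustration under stated assumptions, not the authors' implementation: the squared-exponential kernel, the length scale `theta`, the one-dimensional locations, and all function names are hypothetical choices, and the normalizing constant is omitted.

```python
import math


def gaussian_kernel(x, y, theta=1.0):
    """Squared-exponential kernel; theta is an assumed length scale."""
    return math.exp(-((x - y) ** 2) / (theta ** 2))


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-300:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def dpp_unnormalized_density(points, theta=1.0):
    """Determinant of the kernel matrix [C(x_i, x_j)] (unnormalized)."""
    if not points:
        return 1.0  # determinant of the empty matrix
    kernel_matrix = [
        [gaussian_kernel(x, y, theta) for y in points] for x in points
    ]
    return determinant(kernel_matrix)


# Hypothetical component locations on the real line.
well_separated = dpp_unnormalized_density([0.0, 3.0, 6.0])
nearly_coincident = dpp_unnormalized_density([0.0, 0.1, 6.0])
print(well_separated > nearly_coincident)
```

Under these assumptions the well-separated configuration scores close to 1, while the configuration containing two nearly identical locations scores close to 0. This is the sense in which such a prior discourages redundant mixture components or features.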