Department of Computer Science, University of Basel, Bernoullistr. 16, CH-4056 Basel, Switzerland.
BMC Bioinformatics. 2010 Oct 26;11 Suppl 8(Suppl 8):S8. doi: 10.1186/1471-2105-11-S8-S8.
We present an infinite mixture-of-experts model for finding an unknown number of sub-groups within a given patient cohort based on survival analysis. The effect of patient features on survival is modeled using the Cox proportional hazards model, which yields a non-standard regression component. The model finds key explanatory factors (chosen from main effects and higher-order interactions) for each sub-group by enforcing sparsity on the regression coefficients via the Bayesian Group-Lasso.
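As a rough illustration of the regression component described above, the following sketch evaluates the negative Cox partial log-likelihood for a single sub-group together with a group-lasso penalty on the coefficients. It is not the authors' implementation: the function names, the assumption of no tied event times, and the square-root-of-group-size weighting are illustrative choices, and the penalty stands in for the Bayesian Group-Lasso prior only in the sense of a penalized (MAP-style) objective.

```python
import numpy as np

def cox_neg_log_partial_likelihood(beta, X, time, event):
    """Negative Cox partial log-likelihood (assumes no tied event times)."""
    eta = X @ beta                      # linear predictor per patient
    order = np.argsort(-time)           # sort patients by descending time
    eta_sorted = eta[order]
    event_sorted = event[order]
    # Running log-sum-exp: entry i covers the risk set {j : t_j >= t_i}.
    log_risk = np.logaddexp.accumulate(eta_sorted)
    # Only observed events (event == 1) contribute terms.
    return -np.sum(event_sorted * (eta_sorted - log_risk))

def group_lasso_penalty(beta, groups, lam):
    """Group-lasso penalty; sqrt(group size) weighting is an assumed convention."""
    return lam * sum(np.sqrt(len(g)) * np.linalg.norm(beta[g]) for g in groups)

# Hypothetical toy data: three patients, two features, all events observed.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
time = np.array([3.0, 1.0, 2.0])
event = np.array([1.0, 1.0, 1.0])
beta = np.zeros(2)

objective = (cox_neg_log_partial_likelihood(beta, X, time, event)
             + group_lasso_penalty(beta, groups=[[0, 1]], lam=0.5))
```

Because the penalty acts on the Euclidean norm of each coefficient group rather than on individual coefficients, a whole group (for example, all terms belonging to one interaction) is shrunk to zero together, which is what allows entire effects to be selected or dropped per sub-group.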
Simulated examples justify the need for such an elaborate framework, rather than simpler models, for identifying sub-groups along with their key characteristics. When applied to a breast-cancer dataset consisting of survival times and protein expression levels of patients, the model identifies two distinct sub-groups with different survival patterns (low-risk and high-risk), along with their respective sets of compound markers.
The unified framework presented here, which combines cluster and feature detection for survival analysis, provides a powerful tool for analyzing survival patterns within a patient group. The model also demonstrates the feasibility of analyzing complex interactions that can contribute to the definition of novel prognostic compound markers.