Nikooienejad Amir, Wang Wenyi, Johnson Valen E
Texas A&M University.
MD Anderson Cancer Center.
Ann Appl Stat. 2020 Jun;14(2):809-828. doi: 10.1214/20-AOAS1325. Epub 2020 Jun 29.
Efficient variable selection in high-dimensional cancer genomic studies is critical for discovering genes associated with specific cancer types and for predicting response to treatment. Censored survival data are prevalent in such studies. In this article, we introduce a Bayesian variable selection procedure that uses a mixture prior composed of a point mass at zero and an inverse moment prior, in conjunction with the partial likelihood defined by the Cox proportional hazards model. The procedure is implemented in the R package BVSNLP, which supports parallel computing and uses a stochastic search method to explore the model space. Bayesian model averaging is used for prediction. The proposed algorithm outperforms other variable selection procedures in simulation studies and appears to provide more consistent variable selection when applied to actual genomic datasets.
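To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch of a Cox partial log-likelihood and the log density of a product inverse moment prior on the nonzero coefficients. It is an illustration under stated assumptions, not the BVSNLP implementation: the function names, the absence of a tie correction, and the hyperparameter defaults `tau=0.25` and `r=1` are all hypothetical choices made here. The prior density assumed for each nonzero coefficient is tau^(r/2) / Gamma(r/2) * |beta|^-(r+1) * exp(-tau / beta^2).

```python
import math


def cox_partial_loglik(times, events, X, beta):
    """Cox partial log-likelihood for coefficients beta (no tie correction).

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if censored
    X      : list of covariate rows, one per subject
    """
    # Linear predictor x_i' beta for every subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects enter only through risk sets
        # Risk set: subjects still under observation at time t_i.
        risk = [eta[j] for j, t_j in enumerate(times) if t_j >= t_i]
        m = max(risk)  # log-sum-exp shift for numerical stability
        loglik += eta[i] - (m + math.log(sum(math.exp(e - m) for e in risk)))
    return loglik


def log_imom_prior(beta, tau=0.25, r=1):
    """Log density of a product inverse moment prior (assumed form).

    The density vanishes at zero, so any zero coefficient gives -inf.
    tau and r are illustrative defaults, not values from the article.
    """
    total = 0.0
    for b in beta:
        if b == 0.0:
            return -math.inf
        total += (0.5 * r * math.log(tau) - math.lgamma(0.5 * r)
                  - (r + 1) * math.log(abs(b)) - tau / (b * b))
    return total


def log_unnormalized_posterior(times, events, X, beta):
    """Partial log-likelihood plus log prior for one candidate model,
    where X holds only the columns included in that model."""
    return cox_partial_loglik(times, events, X, beta) + log_imom_prior(beta)
```

Because the inverse moment density is exactly zero at the origin, coefficients included in a model are pushed away from zero, while excluded coefficients are handled by the point mass. A stochastic search over models would compare a score of this kind, after integrating out the coefficients, across candidate subsets of covariates; that integration and the search itself are not shown here.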