Wan Y, Datta S, Conklin D J, Kong M
Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY, USA.
Division of Cardiovascular Medicine, Department of Medicine, University of Louisville, Louisville, KY, USA.
J Stat Comput Simul. 2015;85(9):1902-1916. doi: 10.1080/00949655.2014.907801.
Variable selection and prediction become challenging when covariates are missing. Although multiple imputation (MI) is a widely accepted technique for handling missing data, it is not clear how to combine MI results for variable selection, because different imputed data sets may lead to different selected variables. Widely applied variable selection methods include the sparse partial least-squares (SPLS) method and penalized least-squares methods such as the elastic net (ENet). In this paper, we propose an MI-based weighted elastic net (MI-WENet) method that is based on stacked MI data and a weighting scheme for each observation in the stacked data set. In the MI-WENet method, MI accounts for sampling and imputation uncertainty in the missing values, and the weight accounts for the observed information. Extensive numerical simulations are carried out to compare the proposed MI-WENet method with competing alternatives such as SPLS and ENet. In addition, we apply the MI-WENet method to examine the predictor variables for endothelial function, which can be characterized by the median effective dose (ED50) and maximum effect (Emax) in an ex vivo phenylephrine-induced extension and acetylcholine-induced relaxation experiment.
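The stacking-and-weighting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the imputation step is a deliberately simple stochastic draw from each column's observed mean and standard deviation, the weight given to each stacked row is taken to be that subject's fraction of observed covariates divided by the number of imputations M, and the weighted elastic net is solved by a plain coordinate-descent loop. All function names (`impute_once`, `stack_with_weights`, `weighted_enet`) and the tuning values are hypothetical.

```python
import numpy as np

def impute_once(X, rng):
    """Assumed toy imputation: fill each missing entry with a random draw
    from a normal with the column's observed mean and standard deviation."""
    X_imp = X.copy()
    for j in range(X.shape[1]):
        miss = np.isnan(X[:, j])
        obs = X[~miss, j]
        X_imp[miss, j] = rng.normal(obs.mean(), obs.std(ddof=1), miss.sum())
    return X_imp

def stack_with_weights(X, y, M, rng):
    """Stack M imputed copies of the data and attach one weight per row.
    Assumed weight: (fraction of observed covariates for the subject) / M."""
    frac_obs = 1.0 - np.isnan(X).mean(axis=1)
    X_stack = np.vstack([impute_once(X, rng) for _ in range(M)])
    y_stack = np.tile(y, M)
    w_stack = np.tile(frac_obs / M, M)
    return X_stack, y_stack, w_stack

def soft_threshold(z, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def weighted_enet(X, y, w, lam, alpha, n_iter=200):
    """Coordinate descent for
    0.5 * sum_i w_i (y_i - b0 - x_i'b)^2 + lam * (alpha*|b|_1 + 0.5*(1-alpha)*|b|_2^2),
    with the weights rescaled to sum to one."""
    w = w / w.sum()
    x_mean, y_mean = w @ X, w @ y          # weighted means absorb the intercept
    Xc, resid = X - x_mean, y - y_mean
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            resid = resid + Xc[:, j] * beta[j]      # partial residual without j
            rho = np.sum(w * Xc[:, j] * resid)
            z = np.sum(w * Xc[:, j] ** 2)
            beta[j] = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
            resid = resid - Xc[:, j] * beta[j]
    return y_mean - x_mean @ beta, beta

# Small simulated example: two true predictors, four noise covariates,
# and about 20% of covariate entries missing completely at random.
rng = np.random.default_rng(0)
n, p, M = 200, 6, 5
X_full = rng.normal(size=(n, p))
y = 2.0 * X_full[:, 0] - 1.5 * X_full[:, 1] + rng.normal(size=n)
X_miss = X_full.copy()
X_miss[rng.random((n, p)) < 0.2] = np.nan

X_stack, y_stack, w_stack = stack_with_weights(X_miss, y, M, rng)
intercept, beta = weighted_enet(X_stack, y_stack, w_stack, lam=0.1, alpha=0.9)
print(np.round(beta, 3))
```

Fitting a single penalized model to the stacked data yields one set of selected variables rather than M possibly conflicting sets, which is the difficulty the abstract raises. In practice the imputation step would be replaced by a proper MI procedure and the penalty parameters chosen by cross-validation.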