Percival Daniel, Roeder Kathryn, Rosenfeld Roni, Wasserman Larry
Carnegie Mellon University, Department of Statistics, Pittsburgh, PA 15213 USA
Ann Appl Stat. 2011 Jun 1;5(2A):628-644. doi: 10.1214/10-AOAS428.
We introduce a new version of forward stepwise regression. Our modification finds solutions to regression problems where the selected predictors appear in a structured pattern with respect to a predefined distance measure over the candidate predictors. Our method is motivated by the problem of predicting HIV-1 drug resistance from protein sequences. We find that our method improves the interpretability of drug resistance while producing predictive accuracy comparable to standard methods. We also demonstrate our method in a simulation study and present some theoretical results and connections.
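The idea of steering forward stepwise selection toward predictors that lie close to those already chosen can be illustrated with a minimal sketch. It is not the authors' algorithm: the exponential kernel, the `bandwidth` and `eps` parameters, and the function name `structured_forward_stepwise` are all assumptions made for illustration. The abstract specifies only that selection respects a predefined distance measure over the candidate predictors.

```python
import numpy as np


def structured_forward_stepwise(X, y, dist, n_steps, bandwidth=1.0, eps=0.1):
    """Greedy forward selection that favors predictors near those already chosen.

    Illustrative assumption (not taken from the paper): a candidate's score is
    |x_j . residual| times eps + (1 - eps) * mean(exp(-dist / bandwidth)),
    where the mean runs over the already selected predictors.
    """
    n, p = X.shape
    selected = []
    coef = np.array([y.mean()])
    residual = y - y.mean()
    for _ in range(n_steps):
        # Plain forward stepwise score: absolute inner product with the residual.
        scores = np.abs(X.T @ residual)
        if selected:
            # Down-weight candidates that are far from the selected set.
            closeness = np.exp(-dist[:, selected] / bandwidth).mean(axis=1)
            scores = scores * (eps + (1.0 - eps) * closeness)
            scores[selected] = -np.inf  # never pick the same predictor twice
        selected.append(int(np.argmax(scores)))
        # Refit least squares on an intercept plus the selected columns.
        A = np.column_stack([np.ones(n), X[:, selected]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        residual = y - A @ coef
    return selected, coef
```

Setting `eps=1.0` removes the distance weighting and recovers ordinary forward stepwise selection, which gives a simple baseline for comparing the selected sets.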