Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 4362 Esch-sur-Alzette, Luxembourg.
Department of Epidemiology and Data Science, Amsterdam UMC, 1081 HV Amsterdam, The Netherlands.
Bioinformatics. 2021 Aug 4;37(14):2012-2016. doi: 10.1093/bioinformatics/btaa535.
Machine learning in the biomedical sciences should ideally provide predictive and interpretable models. When predicting outcomes from clinical or molecular features, applied researchers often want to know which features have effects, whether these effects are positive or negative, and how strong they are. Regression analysis encodes this information in the coefficients but typically yields less predictive models than more advanced machine learning techniques.
Here, we propose an interpretable meta-learning approach for high-dimensional regression. The elastic net provides a compromise between estimating weak effects for many features and strong effects for some features. It has a mixing parameter to weight between ridge and lasso regularization. Instead of selecting one weighting by tuning, we combine multiple weightings by stacking. We do this in a way that increases predictivity without sacrificing interpretability.
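The idea of stacking over the mixing parameter can be illustrated with a minimal sketch. It is not the starnet implementation or its API: all function names are hypothetical, the penalty strength `lam` is fixed rather than tuned, the solver is a plain coordinate descent, and the meta learner (a lasso on the cross-validated predictions with weights restricted to [0, 1]) is an assumption made for illustration. Because both levels are linear, the stacked model collapses into a single intercept and one coefficient per feature, which is what keeps it interpretable.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator arising from the lasso part of the penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_elastic_net(X, y, alpha, lam, lower=float("-inf"), upper=float("inf"), sweeps=100):
    """Coordinate descent for
    (1/2n)*||y - b0 - X b||^2 + lam*((1-alpha)/2*||b||_2^2 + alpha*||b||_1),
    with optional box constraints lower <= b_j <= upper.
    alpha=0 gives ridge, alpha=1 gives lasso."""
    n, p = len(X), len(X[0])
    b0, beta = sum(y) / n, [0.0] * p
    resid = [yi - b0 for yi in y]
    for _ in range(sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            denom = sum(c * c for c in col) / n + lam * (1.0 - alpha)
            new = min(upper, max(lower, soft_threshold(rho, lam * alpha) / denom))
            resid = [r - c * (new - beta[j]) for c, r in zip(col, resid)]
            beta[j] = new
        shift = sum(resid) / n  # re-centre the residuals to update the intercept
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, beta

def predict(X, b0, beta):
    return [b0 + sum(x * b for x, b in zip(row, beta)) for row in X]

def stacked_elastic_net(X, y, alphas=(0.0, 0.5, 1.0), lam=0.1, n_folds=3):
    n = len(X)
    fold = [i % n_folds for i in range(n)]
    # Cross-validated predictions: one column per mixing parameter.
    Z = [[0.0] * len(alphas) for _ in range(n)]
    for k in range(n_folds):
        train = [i for i in range(n) if fold[i] != k]
        test = [i for i in range(n) if fold[i] == k]
        X_train, y_train = [X[i] for i in train], [y[i] for i in train]
        for a, alpha in enumerate(alphas):
            b0, beta = fit_elastic_net(X_train, y_train, alpha, lam)
            for i, pred in zip(test, predict([X[i] for i in test], b0, beta)):
                Z[i][a] = pred
    # Base learners refitted on all samples, one per mixing parameter.
    base = [fit_elastic_net(X, y, alpha, lam) for alpha in alphas]
    # Meta learner (assumption): lasso on the cross-validated predictions,
    # with one weight in [0, 1] per base learner.
    w0, w = fit_elastic_net(Z, y, alpha=1.0, lam=0.01, lower=0.0, upper=1.0)
    # Collapse both levels into one intercept and one coefficient per feature.
    intercept = w0 + sum(wk * b0 for wk, (b0, _) in zip(w, base))
    coef = [sum(wk * beta[j] for wk, (_, beta) in zip(w, base))
            for j in range(len(X[0]))]
    return intercept, coef, w0, w, base

# Simulated data (illustrative only): two features with effects, six without.
random.seed(1)
n, p = 60, 8
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
true_beta = [2.0, -1.0] + [0.0] * (p - 2)
y = [sum(x * b for x, b in zip(row, true_beta)) + random.gauss(0, 0.5) for row in X]

intercept, coef, w0, w, base = stacked_elastic_net(X, y)
print("weights per mixing parameter:", [round(wk, 3) for wk in w])
print("combined coefficients:", [round(c, 3) for c in coef])
```

In this sketch, predicting from the combined coefficients gives the same values as first predicting with each base learner and then applying the meta weights, so the sign and size of each feature's effect can still be read from a single coefficient vector.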
The R package starnet is available on GitHub (https://github.com/rauschenberger/starnet) and CRAN (https://CRAN.R-project.org/package=starnet).