Division of Biostatistics, University of California San Diego, La Jolla, California, USA.
IBM T. J. Watson Research Center, Yorktown Heights, New York, USA.
Biometrics. 2022 Sep;78(3):1155-1167. doi: 10.1111/biom.13481. Epub 2021 May 19.
Feature selection is indispensable in microbiome data analysis, but it can be particularly challenging because microbiome data sets are high-dimensional, underdetermined, sparse, and compositional. Great efforts have recently been made in developing new feature selection methods that handle these data characteristics, but almost all methods have been evaluated on the performance of their model predictions. Little attention has been paid to a fundamental question: how appropriate are those evaluation criteria? Most feature selection methods control the model fit, but the ability to identify meaningful subsets of features cannot be judged by prediction accuracy alone. If tiny changes to the data lead to large changes in the chosen feature subset, then many selected features are likely to be data artifacts rather than real biological signal. This crucial need to identify relevant and reproducible features motivates reproducibility criteria such as Stability, which quantifies how robust a method is to perturbations in the data. In our paper, we compare popular model prediction metrics (MSE or AUC) with the proposed reproducibility criterion Stability in evaluating four widely used feature selection methods, in both simulations and experimental microbiome applications with continuous or binary outcomes. We conclude that Stability is preferable to model prediction metrics as a feature selection criterion because it better quantifies the reproducibility of the feature selection method.
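To make the idea of a stability criterion concrete, the sketch below estimates how reproducible a feature selection procedure is by rerunning it on random subsamples of the data and averaging the pairwise Jaccard similarity of the selected feature sets. This is a minimal illustration only: the paper's Stability index, the choice of LassoCV as the selector, and the subsampling settings (n_subsamples, frac) are assumptions for demonstration, not the authors' exact definitions.

import numpy as np
from itertools import combinations
from sklearn.linear_model import LassoCV

def selection_stability(X, y, n_subsamples=30, frac=0.8, seed=0):
    """Rough stability estimate: run the selector on random subsamples and
    average the pairwise Jaccard similarity of the selected feature sets."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    selected_sets = []
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        model = LassoCV(cv=5).fit(X[idx], y[idx])          # illustrative selector
        selected_sets.append(frozenset(np.flatnonzero(model.coef_)))
    sims = []
    for a, b in combinations(selected_sets, 2):
        union = a | b
        sims.append(len(a & b) / len(union) if union else 1.0)
    return float(np.mean(sims))

# Synthetic, compositional-like data: 100 samples of 50 "taxa" proportions
rng = np.random.default_rng(1)
X = rng.dirichlet(np.ones(50), size=100)
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)
print(round(selection_stability(X, y), 3))                  # closer to 1 = more stable selection

A value near 1 indicates that nearly the same features are selected regardless of which subsample is used, whereas a low value signals that the selected subset is sensitive to small perturbations of the data, the situation the abstract flags as likely reflecting artifacts rather than biological signal.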