Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, United States.
Center for Artificial Intelligence and Modeling, The Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL 61820, United States.
Bioinformatics. 2023 Apr 3;39(4). doi: 10.1093/bioinformatics/btad135.
Feature selection is a powerful dimension reduction technique that selects a subset of relevant features for model construction. Numerous feature selection methods have been proposed, but most of them fail in the high-dimensional, low-sample-size (HDLSS) setting because of overfitting.
We present a deep learning-based method, GRAph Convolutional nEtwork feature Selector (GRACES), to select important features for HDLSS data. GRACES exploits latent relations between samples and applies various overfitting-reducing techniques to iteratively find a set of optimal features that gives rise to the greatest decreases in the optimization loss. We demonstrate that GRACES significantly outperforms other feature selection methods on both synthetic and real-world datasets.
The source code is publicly available at https://github.com/canc1993/graces.
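The idea of iteratively adding the feature that yields the greatest decrease in loss can be illustrated with a minimal sketch. This is not the GRACES implementation: it assumes a plain least-squares model in place of the graph convolutional network, ignores the relations between samples and the overfitting-reducing techniques, and uses a hypothetical function name (`greedy_forward_selection`) and synthetic toy data.

```python
import numpy as np


def greedy_forward_selection(X, y, k):
    """Return k feature indices, adding at each step the feature that
    lowers the training loss the most.

    Assumption: a least-squares fit stands in for the model; GRACES
    itself uses a graph convolutional network.
    """
    n_samples, n_features = X.shape
    selected = []
    for _ in range(k):
        best_j, best_loss = None, np.inf
        for j in range(n_features):
            if j in selected:
                continue
            # Design matrix: already selected features, candidate j, intercept.
            A = np.column_stack([X[:, selected + [j]], np.ones(n_samples)])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            loss = np.mean((A @ coef - y) ** 2)
            if loss < best_loss:
                best_j, best_loss = j, loss
        selected.append(best_j)
    return selected


# Toy HDLSS-style data (synthetic, for illustration only):
# 20 samples, 200 features, with the target driven by two of the features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 200))
y = 3.0 * X[:, 5] - 2.0 * X[:, 17]
selected = greedy_forward_selection(X, y, k=2)
print(selected)
```

With so few samples, a purely greedy criterion like this one can pick up spurious features, which is the overfitting problem the abstract refers to.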