Ibrahim Rania, Yousri Noha A, Ismail Mohamed A, El-Makky Nagwa M
Annu Int Conf IEEE Eng Med Biol Soc. 2014;2014:3957-60. doi: 10.1109/EMBC.2014.6944490.
Selecting the most discriminative genes/miRNAs has emerged as an important task in bioinformatics, both to improve disease classifiers and to mitigate the curse of dimensionality. Conventional feature selection methods choose genes/miRNAs based on their individual features, regardless of how they perform together. Considering group features instead of individual ones provides a better view for selecting the most informative genes/miRNAs. Recently, deep learning has proven its ability to represent data at multiple levels of abstraction, allowing for better discrimination between classes. However, deep learning is not yet widely used for feature selection in bioinformatics. In this paper, a novel multi-level feature selection approach named MLFS is proposed for selecting genes/miRNAs based on expression profiles. The approach is based on both deep and active learning. Moreover, an extension of the technique to miRNAs is presented that considers the biological relation between miRNAs and genes. Experimental results show that the approach outperforms classical feature selection methods in F1-measure by 9% for hepatocellular carcinoma (HCC), 6% for lung cancer, and around 10% for breast cancer. Results also show an improvement in F1-measure over the recent related work in [1] and [2].
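The contrast between individual and group features, and the F1-measure used to report results, can be illustrated with a toy sketch. This is not the MLFS method and uses no data from the paper: the two binarized "genes", the XOR labelling, and the helper names (f1_score, best_single_feature_f1) are assumptions made purely for illustration.

```python
from itertools import product


def f1_score(y_true, y_pred, positive=1):
    """F1-measure: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: each sample is (gene_a, gene_b) with binarized
# expression, and the class label is gene_a XOR gene_b. Neither gene is
# informative alone, but the pair determines the class exactly.
samples = list(product([0, 1], repeat=2)) * 5
labels = [a ^ b for a, b in samples]


def best_single_feature_f1(idx):
    """Best F1 from predicting the label as one feature or as its inverse."""
    direct = [s[idx] for s in samples]
    inverted = [1 - s[idx] for s in samples]
    return max(f1_score(labels, direct), f1_score(labels, inverted))


single_a = best_single_feature_f1(0)  # gene_a scored on its own
single_b = best_single_feature_f1(1)  # gene_b scored on its own
# Rule that uses both genes together (here simply the XOR itself).
joint = f1_score(labels, [a ^ b for a, b in samples])

print(f"gene_a alone: F1 = {single_a:.2f}")
print(f"gene_b alone: F1 = {single_b:.2f}")
print(f"gene_a and gene_b together: F1 = {joint:.2f}")
```

In this constructed case a ranking that scores each feature separately would find neither gene useful, while a selector that evaluates features as a group could retain both. Real expression data is continuous and far noisier, so the gap is rarely this clean.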