Eichler Gabriel S, Reimers Mark, Kane David, Weinstein John N
Genomics and Bioinformatics Groups, Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.
Genome Biol. 2007;8(9):R187. doi: 10.1186/gb-2007-8-9-r187.
Interpretation of microarray data remains a challenge, and most methods fail to consider the complex, nonlinear regulation of gene expression. To address that limitation, we introduce Learner of Functional Enrichment (LeFE), a statistical/machine learning algorithm based on Random Forest, and demonstrate it on several diverse datasets: smoker/never smoker, breast cancer classification, and cancer drug sensitivity. We also compare it with previously published algorithms, including Gene Set Enrichment Analysis. LeFE regularly identifies statistically significant functional themes consistent with known biology.