Asyali Musa H
Department of Computer Engineering, Yasar University, Kazim Dirik Mah. 364 Sok. No: 5, Bornova 35500, Izmir, Turkey.
Comput Biol Med. 2007 Dec;37(12):1690-9. doi: 10.1016/j.compbiomed.2007.04.001. Epub 2007 May 22.
Owing to recent advances in DNA microarray technology, the diagnostic category of tissue samples can be predicted with high accuracy from gene expression profiles. In this study, we discuss the shortcomings of some existing gene expression profile classification methods and propose a new approach based on linear Bayesian classifiers. In our approach, we first construct gene-level linear classifiers to identify genes that provide high class-prediction accuracy, i.e., low error rates. After this screening phase, starting with the gene that offers the lowest error rate, we construct a multi-dimensional linear classifier by incorporating the next best-performing genes until the prediction error reaches its minimum (0, if possible). When we compared the classification performance of our approach against approaches based on prediction analysis of microarrays (PAM) and support vector machines (SVM), we found that our method outperforms PAM and produces results comparable to those of SVM. In addition, we observed that the gene selection scheme of PAM could be misleading. Although SVM achieves relatively higher prediction performance, it has two major disadvantages: complexity and a lack of insight into which genes are important. Our intuitive approach offers competitive performance as well as an efficient means of finding important genes.
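The two-phase procedure described in the abstract (per-gene screening, then growing a multi-gene linear classifier in order of single-gene error) can be illustrated with a minimal sketch. This is not the author's implementation: the abstract does not specify how error rates are estimated, so the sketch assumes a Gaussian classifier with a pooled covariance matrix (which yields a linear decision boundary), uses training-set error for simplicity, and adds a small ridge term for numerical stability. The function names and the `max_genes` cap are hypothetical.

```python
import numpy as np

def fit_linear_bayes(X, y):
    """Fit a Gaussian classifier with a pooled covariance (linear boundary)."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    priors = np.array([np.mean(y == c) for c in classes])
    # Pooled within-class covariance; the ridge term is an assumption
    # added only to keep the matrix invertible.
    centered = np.vstack([X[y == c] - means[i] for i, c in enumerate(classes)])
    cov = centered.T @ centered / max(len(y) - len(classes), 1)
    cov = np.atleast_2d(cov) + 1e-6 * np.eye(X.shape[1])
    return classes, means, priors, np.linalg.inv(cov)

def predict_linear_bayes(model, X):
    """Assign each sample to the class with the highest linear score."""
    classes, means, priors, inv_cov = model
    # score_k(x) = x' S^-1 mu_k - 0.5 * mu_k' S^-1 mu_k + log(prior_k)
    linear_term = X @ inv_cov @ means.T
    offset = -0.5 * np.sum(means @ inv_cov * means, axis=1) + np.log(priors)
    return classes[np.argmax(linear_term + offset, axis=1)]

def error_rate(X, y):
    """Training-set error (assumption: the paper may use another estimate)."""
    model = fit_linear_bayes(X, y)
    return float(np.mean(predict_linear_bayes(model, X) != y))

def select_genes(X, y, max_genes=10):
    """Screen genes one at a time, then add them in order of increasing error."""
    n_genes = X.shape[1]
    # Phase 1: one-dimensional classifier per gene.
    gene_errors = np.array([error_rate(X[:, [g]], y) for g in range(n_genes)])
    ranking = np.argsort(gene_errors, kind="stable")
    # Phase 2: grow the classifier, keeping the subset with the lowest error.
    best_subset, best_error = [], 1.0
    for k in range(1, min(max_genes, n_genes) + 1):
        subset = ranking[:k].tolist()
        err = error_rate(X[:, subset], y)
        if err < best_error:
            best_subset, best_error = subset, err
        if best_error == 0.0:
            break  # zero error reached; no need to add more genes
    return best_subset, best_error, gene_errors
```

Here `X` is assumed to be a samples-by-genes expression matrix and `y` a vector of class labels. Because training-set error is optimistic, a cross-validated estimate would be a more cautious choice when applying this kind of screening to real microarray data.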