An improved hybrid of SVM and SCAD for pathway analysis.

Author information

Misman Muhammad Faiz, Mohamad Mohd Saberi, Deris Safaai, Abdullah Afnizanfaizal, Hashim Siti Zaiton Mohd

Affiliation

Faculty of Computer Science and Information Systems, Universiti Teknologi Malaysia, 81310, Skudai, Johor Darul Takzim, Malaysia.

Publication information

Bioinformation. 2011;7(4):169-75. doi: 10.6026/97320630007169. Epub 2011 Oct 14.

Abstract

Pathway analysis has led to a new era in genomic research by providing more information on biological processes than traditional single-gene analysis. Besides this advantage, pathway analysis poses some challenges for researchers, one of which is the quality of the pathway data itself. Pathway data are usually defined without reference to a biological context; in a specific biological context (e.g. lung cancer), typically only a few genes within a pathway are responsible for the corresponding cellular process. Some pathways may also include uninformative genes, or informative genes may have been excluded. Moreover, many pathway analysis algorithms neglect these limitations by treating all genes within a pathway as significant. In a previous study, a hybrid of support vector machines and smoothly clipped absolute deviation with group-specific tuning parameters (gSVM-SCAD) was proposed to identify and select the informative genes before the pathway evaluation process. However, gSVM-SCAD showed a limitation in classification accuracy. To address this limitation, we enhanced the tuning parameter method for gSVM-SCAD by applying B-type generalized approximate cross validation (BGACV). Experimental analyses using one simulated data set and two gene expression data sets show that the proposed method obtains significant results in identifying biologically significant genes and pathways, and in classification accuracy.
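The abstract names the smoothly clipped absolute deviation (SCAD) penalty but does not state it. As a minimal sketch, assuming the standard form of the penalty from the SCAD literature with the commonly used default a = 3.7, the per-coefficient penalty can be written as below. The function names and the way the group-specific tuning parameter is applied (one lambda per pathway) are illustrative assumptions, not the authors' implementation; neither the SVM fitting nor the BGACV tuning step is shown.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty for one coefficient (assumes lam > 0 and a > 2)."""
    t = abs(theta)
    if t <= lam:
        # Small coefficients: behaves like the L1 (lasso) penalty.
        return lam * t
    if t <= a * lam:
        # Medium coefficients: quadratic transition that tapers the penalty.
        return -(t * t - 2 * a * lam * t + lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return (a + 1) * lam * lam / 2


def grouped_scad_penalty(weights_by_group, lam_by_group, a=3.7):
    """Hypothetical helper: sum SCAD penalties using one lambda per gene group.

    weights_by_group maps a group (pathway) name to its coefficient list;
    lam_by_group maps the same names to that group's tuning parameter.
    """
    total = 0.0
    for group, weights in weights_by_group.items():
        lam = lam_by_group[group]
        total += sum(scad_penalty(w, lam, a) for w in weights)
    return total
```

Because the penalty is flat beyond a * lam, large coefficients (informative genes) are left nearly unpenalized while small ones are shrunk toward zero, which is the property that makes SCAD suited to gene selection within a pathway.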


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9727/3218518/5ee82ae80bc4/97320630007169F1.jpg
