
Gene selection algorithms for microarray data based on least squares support vector machine.

Authors

Tang E Ke, Suganthan P N, Yao Xin

Affiliation

School of Electrical & Electronic Engineering, Nanyang Technological University, Singapore.

Publication

BMC Bioinformatics. 2006 Feb 27;7:95. doi: 10.1186/1471-2105-7-95.

DOI: 10.1186/1471-2105-7-95
PMID: 16504159
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1409801/

Abstract

BACKGROUND

In discriminant analysis of microarray data, a small number of samples is usually described by a large number of genes. Conducting the discriminant analysis with all the genes is not only difficult but also unnecessary. Hence, gene selection is usually performed to identify the important genes.

RESULTS

A gene selection method searches for an optimal or near-optimal subset of genes with respect to a given evaluation criterion. In this paper, we propose a new evaluation criterion, named the leave-one-out calculation (LOOC) measure (a list of abbreviations appears just above the list of references in the full paper). A gene selection method, named the leave-one-out calculation sequential forward selection (LOOCSFS) algorithm, is then presented by combining the LOOC measure with the sequential forward selection scheme. Further, a novel gene selection algorithm, the gradient-based leave-one-out gene selection (GLGS) algorithm, is also proposed. Both algorithms originate from an efficient and exact calculation of the leave-one-out cross-validation error of the least squares support vector machine (LS-SVM). The proposed approaches are applied to two microarray datasets and compared to other well-known gene selection methods using code available from the second author.

CONCLUSION

The proposed gene selection approaches can provide gene subsets leading to more accurate classification results, while their computational complexity is comparable to the existing methods. The GLGS algorithm can also better scale to datasets with a very large number of genes.
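
The Results describe both algorithms as resting on an exact calculation of the LS-SVM leave-one-out error that avoids retraining. The Python sketch below illustrates that idea under stated assumptions. It uses the standard closed-form identity for LS-SVM leave-one-out residuals (residual_i = alpha_i / [C^-1]_ii), a linear kernel, a regularization parameter `gamma`, and the sum of squared leave-one-out residuals as the selection score. The abstract does not define the LOOC measure, so this score and the function names `lssvm_loo_residuals` and `forward_select` are illustrative stand-ins, not the authors' implementation.

```python
import numpy as np


def lssvm_loo_residuals(X, y, gamma=1.0):
    """Exact leave-one-out residuals y_i - yhat_i^(-i) of an LS-SVM.

    Assumptions (not taken from the abstract): linear kernel, labels in
    {+1, -1}, and the usual LS-SVM linear system
        [K + I/gamma  1] [alpha]   [y]
        [1^T          0] [  b  ] = [0].
    """
    n = X.shape[0]
    K = X @ X.T                          # linear kernel matrix
    C = np.zeros((n + 1, n + 1))
    C[:n, :n] = K + np.eye(n) / gamma    # regularized kernel block
    C[:n, n] = 1.0                       # bias column
    C[n, :n] = 1.0                       # bias row
    C_inv = np.linalg.inv(C)
    alpha = (C_inv @ np.append(y, 0.0))[:n]
    # Closed-form identity: no model is retrained for any left-out sample.
    return alpha / np.diag(C_inv)[:n]


def forward_select(X, y, n_genes, gamma=1.0):
    """Greedy sequential forward selection of gene (column) indices.

    At each step, add the gene that minimizes the sum of squared
    leave-one-out residuals (an assumed stand-in for the LOOC measure).
    """
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(min(n_genes, X.shape[1])):
        scores = [
            np.sum(lssvm_loo_residuals(X[:, selected + [j]], y, gamma) ** 2)
            for j in remaining
        ]
        best = remaining[int(np.argmin(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected
```

Here `X` is a samples-by-genes array and `y` holds labels coded as +1/-1. Each forward step in this naive version costs one matrix inversion per candidate gene, so it is only practical on a modest number of candidate genes.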


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f3ec/1409801/661712fe6e82/1471-2105-7-95-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f3ec/1409801/9414958322ed/1471-2105-7-95-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f3ec/1409801/568ac86b81ba/1471-2105-7-95-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f3ec/1409801/a563d968330a/1471-2105-7-95-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f3ec/1409801/ef9812ad49da/1471-2105-7-95-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f3ec/1409801/fb8f1940f8e9/1471-2105-7-95-6.jpg

Similar articles

1
Gene selection algorithms for microarray data based on least squares support vector machine.
BMC Bioinformatics. 2006 Feb 27;7:95. doi: 10.1186/1471-2105-7-95.
2
Development of two-stage SVM-RFE gene selection strategy for microarray expression data analysis.
IEEE/ACM Trans Comput Biol Bioinform. 2007 Jul-Sep;4(3):365-81. doi: 10.1109/TCBB.2007.70224.
3
Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data.
BMC Bioinformatics. 2007 Feb 28;8:67. doi: 10.1186/1471-2105-8-67.
4
LS Bound based gene selection for DNA microarray data.
Bioinformatics. 2005 Apr 15;21(8):1559-64. doi: 10.1093/bioinformatics/bti216. Epub 2004 Dec 14.
5
Recursive gene selection based on maximum margin criterion: a comparison with SVM-RFE.
BMC Bioinformatics. 2006 Dec 25;7:543. doi: 10.1186/1471-2105-7-543.
6
MSVM-RFE: extensions of SVM-RFE for multiclass gene selection on DNA microarray data.
Bioinformatics. 2007 May 1;23(9):1106-14. doi: 10.1093/bioinformatics/btm036.
7
A stable gene selection in microarray data analysis.
BMC Bioinformatics. 2006 Apr 27;7:228. doi: 10.1186/1471-2105-7-228.
8
Regularized Least Squares Cancer classifiers from DNA microarray data.
BMC Bioinformatics. 2005 Dec 1;6 Suppl 4(Suppl 4):S2. doi: 10.1186/1471-2105-6-S4-S2.
9
Recursive SVM feature selection and sample classification for mass-spectrometry and microarray data.
BMC Bioinformatics. 2006 Apr 10;7:197. doi: 10.1186/1471-2105-7-197.
10
A unified framework for finding differentially expressed genes from microarray experiments.
BMC Bioinformatics. 2007 Sep 18;8:347. doi: 10.1186/1471-2105-8-347.

Cited by

1
Classification Model for Diabetic Foot, Necrotizing Fasciitis, and Osteomyelitis.
Biology (Basel). 2022 Sep 3;11(9):1310. doi: 10.3390/biology11091310.
2
Five weeks of intermittent transcutaneous vagus nerve stimulation shape neural networks: a machine learning approach.
Brain Imaging Behav. 2022 Jun;16(3):1217-1233. doi: 10.1007/s11682-021-00572-y. Epub 2021 Dec 29.
3
Genetic variations analysis for complex brain disease diagnosis using machine learning techniques: opportunities and hurdles.
PeerJ Comput Sci. 2021 Sep 20;7:e697. doi: 10.7717/peerj-cs.697. eCollection 2021.
4
MagIO: Magnetic Field Strength Based Indoor-Outdoor Detection with a Commercial Smartphone.
Micromachines (Basel). 2018 Oct 20;9(10):534. doi: 10.3390/mi9100534.
5
A Review of Feature Selection and Feature Extraction Methods Applied on Microarray Data.
Adv Bioinformatics. 2015;2015:198363. doi: 10.1155/2015/198363. Epub 2015 Jun 11.
6
Analyzing kernel matrices for the identification of differentially expressed genes.
PLoS One. 2013 Dec 9;8(12):e81683. doi: 10.1371/journal.pone.0081683. eCollection 2013.
7
Fusing Gene Interaction to Improve Disease Discrimination on Classification Analysis.
Adv Genet Eng. 2012 Feb 9;1(1):1000102. doi: 10.4172/AGE.1000102.
8
Wrapper-based selection of genetic features in genome-wide association studies through fast matrix operations.
Algorithms Mol Biol. 2012 May 2;7(1):11. doi: 10.1186/1748-7188-7-11.
9
Genome-wide polycomb target gene prediction in Drosophila melanogaster.
Nucleic Acids Res. 2012 Jul;40(13):5848-63. doi: 10.1093/nar/gks209. Epub 2012 Mar 13.
10
Gene selection and classification for cancer microarray data based on machine learning and similarity measures.
BMC Genomics. 2011 Dec 23;12 Suppl 5(Suppl 5):S1. doi: 10.1186/1471-2164-12-S5-S1.