Combining multiple approaches for gene microarray classification.

Affiliation

Department of Information Engineering, University of Padua, Padova, Italy.

Publication information

Bioinformatics. 2012 Apr 15;28(8):1151-7. doi: 10.1093/bioinformatics/bts108. Epub 2012 Mar 5.

DOI:10.1093/bioinformatics/bts108
PMID:22390939
Abstract

MOTIVATION

A microarray report records the expression levels of tens of thousands of genes, producing a feature vector that is high-dimensional and contains much irrelevant information. This dimensionality degrades classification performance. Moreover, datasets typically contain few training samples, which leads to the 'curse of dimensionality' problem. It is therefore essential to find good methods for reducing the size of the feature set.

RESULTS

In this article, we propose a method for gene microarray classification that combines different feature reduction approaches to improve classification performance. Using a support vector machine (SVM) as our classifier, we examine an SVM trained on a set of selected genes; an SVM trained on the feature set obtained by the Neighborhood Preserving Embedding feature transform; a set of SVMs trained on orthogonal wavelet coefficients obtained with different mother wavelets; a set of SVMs trained on texture descriptors extracted from the microarray, treating it as an image; and an ensemble that combines the best of the feature extraction methods listed above. The positive results reported confirm that combining different feature extraction methods greatly enhances system performance. The experiments were performed on several different datasets, and our results [expressed as both accuracy and area under the receiver operating characteristic (ROC) curve] show that the proposed approach compares favorably with the state of the art.
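
Two of the ideas above, wavelet coefficients as reduced features and an ensemble over several classifiers, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name the wavelet families, the decomposition depth, or the fusion rule, so the Haar wavelet, the `levels` parameter, the z-score normalisation, and the sum rule below are all assumptions chosen for simplicity. The function names are hypothetical, and the scores passed to `sum_rule` stand in for the decision values that trained SVMs would produce.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    values = list(signal)
    if len(values) % 2:
        values.append(values[-1])  # pad odd-length input by repeating the last value
    root2 = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / root2 for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / root2 for i in range(0, len(values), 2)]
    return approx, detail


def haar_features(expression, levels=2):
    """Reduce an expression vector to its coarse Haar approximation coefficients.

    Assumption: only the approximation band is kept; each level halves the length.
    """
    coeffs = list(expression)
    for _ in range(levels):
        coeffs, _detail = haar_step(coeffs)
    return coeffs


def zscore(scores):
    """Standardise one classifier's scores so that classifiers are comparable."""
    mean = sum(scores) / len(scores)
    var = sum((x - mean) ** 2 for x in scores) / len(scores)
    std = math.sqrt(var) or 1.0  # constant scores: avoid division by zero
    return [(x - mean) / std for x in scores]


def sum_rule(score_lists):
    """Fuse classifiers: standardise each score list, then add sample by sample.

    Assumption: the sum rule is one common fusion choice, not necessarily the
    rule used in the paper.
    """
    normalised = [zscore(scores) for scores in score_lists]
    return [sum(column) for column in zip(*normalised)]


if __name__ == "__main__":
    # A toy 4-gene expression vector reduced to a single coefficient.
    print(haar_features([4.0, 4.0, 2.0, 2.0], levels=2))
    # Placeholder decision scores from two hypothetical classifiers on 3 samples.
    print(sum_rule([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]))
```

In a fuller pipeline, each feature set (selected genes, wavelet coefficients, and so on) would train its own SVM, and the fused score would be thresholded to give the final class label.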

AVAILABILITY

The MATLAB code of the proposed approach is publicly available at bias.csr.unibo.it/nanni/micro.rar.
