A comparative study of improvements Pre-filter methods bring on feature selection using microarray data.

Author information

Research Center for Biomedical Information, Shenzhen Institutes of Advanced Technologies, Chinese Academy of Sciences, Shenzhen, China.

Publication information

Health Inf Sci Syst. 2014 Oct 16;2:7. doi: 10.1186/2047-2501-2-7. eCollection 2014.

DOI:10.1186/2047-2501-2-7

PMID:25825671

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4340279/

Abstract

BACKGROUND

With the development of microarray technology, feature selection techniques have become a clear need in biomarker discovery. However, the high-dimensional nature of microarray data makes feature selection time-consuming. To overcome this difficulty, filtering the data according to background knowledge before applying feature selection techniques has become a hot topic in microarray analysis. Different pre-filter methods may affect the final results greatly, so it is important to evaluate them systematically.

METHODS

In this paper, we compared the performance of statistical-based pre-filter methods, biological-based pre-filter methods, and their combination on microRNA-mRNA parallel expression profiles, using L1 logistic regression as the feature selection technique. Four types of data were built for both the microRNA and the mRNA expression profiles.
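As a rough illustration of the pipeline described above (pre-filter first, then L1 logistic regression as the feature selector), the following is a minimal sketch in pure Python. It is not the authors' implementation: the abstract does not specify which statistical test was used, so the Welch t-statistic, the `keep` cutoff, the penalty `lam`, the proximal gradient solver, and the synthetic data in the hypothetical `demo` helper are all assumptions made for illustration. Only a statistical pre-filter is shown; a biological pre-filter would need external annotation that is not available here.

```python
import math
import random


def welch_t(a, b):
    """Welch's t-statistic for two lists of expression values."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((v - mean_a) ** 2 for v in a) / (len(a) - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (len(b) - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b) + 1e-12)


def prefilter(X, y, keep):
    """Assumed statistical pre-filter: indices of the `keep` features with largest |t|."""
    scored = []
    for j in range(len(X[0])):
        case = [row[j] for row, label in zip(X, y) if label == 1]
        ctrl = [row[j] for row, label in zip(X, y) if label == 0]
        scored.append((abs(welch_t(case, ctrl)), j))
    scored.sort(reverse=True)
    return sorted(j for _, j in scored[:keep])


def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)


def l1_logistic(X, y, lam=0.02, lr=0.1, iters=500):
    """L1-penalised logistic regression fitted by proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            z = max(-30.0, min(30.0, z))  # guard against overflow in exp
            err = 1.0 / (1.0 + math.exp(-z)) - label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        b -= lr * grad_b  # the intercept is not penalised
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def demo(seed=0, n=80, p=200, keep=10):
    """Synthetic example (assumption): only features 0 and 1 differ between classes."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(0.0, 1.0) + (2.0 * label if j < 2 else 0.0) for j in range(p)]
         for label in y]
    kept = prefilter(X, y, keep)                      # step 1: pre-filter
    X_small = [[row[j] for j in kept] for row in X]   # reduced feature matrix
    w, _ = l1_logistic(X_small, y)                    # step 2: L1 feature selection
    selected = [j for j, wj in zip(kept, w) if wj != 0.0]
    return kept, selected


if __name__ == "__main__":
    kept, selected = demo()
    print("kept after pre-filter:", kept)
    print("selected by L1 logistic regression:", selected)
```

In this sketch the L1 model is fitted on 10 pre-filtered features instead of all 200, which is the source of the time saving the abstract reports; the selected features are those whose coefficients remain nonzero after soft-thresholding.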

RESULTS

Pre-filter methods greatly reduced the number of features for both the mRNA and the microRNA expression datasets. The features selected after pre-filtering were shown to be significant at biological levels such as biological process and microRNA function. Analyses of classification performance based on precision showed that pre-filter methods were necessary when the number of raw features was much larger than the number of samples. Computing time was greatly shortened in all cases after pre-filtering.

CONCLUSIONS

Because it gives similar or better classification performance with fewer but biologically significant features, pre-filter-based feature selection should be considered when researchers need fast results on complex computing problems in bioinformatics.

[Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/440a/4340279/7e5c7818fd23/13755_2014_17_Fig1_HTML.jpg]
