Identifying Cancer Biomarkers From Microarray Data Using Feature Selection and Semisupervised Learning.

Publication information

IEEE J Transl Eng Health Med. 2014 Dec 2;2:4300211. doi: 10.1109/JTEHM.2014.2375820. eCollection 2014.

Abstract

Microarrays have gone from obscurity to being almost ubiquitous in biological research. At the same time, the statistical methodology for microarray analysis has progressed from simple visual assessment of results to novel algorithms for analyzing changes in expression profiles. In a microRNA (miRNA) or gene-expression profiling experiment, the expression levels of thousands of genes or miRNAs are monitored simultaneously to study the effects of particular treatments, diseases, and developmental stages on their expression. Microarray-based gene expression profiling can identify genes whose expression changes in response to pathogens or other organisms by comparing gene expression in infected cells or tissues with that in uninfected ones. Recent studies have revealed that patterns of altered microarray expression profiles in cancer can serve as molecular biomarkers for tumor diagnosis, prognosis of disease-specific outcomes, and prediction of therapeutic responses. Microarray data sets containing the expression profiles of a number of miRNAs or genes are used to identify biomarkers that are dysregulated between normal and malignant tissues. However, small sample size remains a bottleneck in designing successful classification methods. On the other hand, an adequate number of microarray samples that lack clinical knowledge (unlabeled samples) can be employed as an additional source of information. In this paper, a combination of kernelized fuzzy rough set (KFRS) and semisupervised support vector machine (S³VM) is proposed for predicting cancer biomarkers from one miRNA and three gene expression data sets. Biomarkers are discovered using three feature selection methods, including KFRS. The effectiveness of the proposed KFRS and S³VM combination on the microarray data sets is demonstrated, and the cancer biomarkers identified from the miRNA data are reported. Furthermore, biological significance tests are conducted for the miRNA cancer biomarkers.
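The general shape of such a pipeline (select features from a small labeled set, then let unlabeled samples refine the classifier) can be illustrated with a minimal sketch. It is not the authors' method: a simple signal-to-noise ranking stands in for KFRS, a nearest-centroid classifier with self-training stands in for the semisupervised support vector machine, and the data, function names, and thresholds are all hypothetical.

```python
import random
from statistics import mean, pvariance

def make_toy_data(n_samples, n_features, informative, seed=0):
    """Synthetic 'expression' matrix; only `informative` columns differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate classes so both are always present
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 4.0 * label  # shift informative features in class 1
        X.append(row)
        y.append(label)
    return X, y

def rank_features(X, y, k):
    """Stand-in for KFRS: keep the k features with the largest
    between-class mean difference relative to within-class spread."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        spread = (pvariance(a) + pvariance(b)) ** 0.5 + 1e-9
        scores.append((abs(mean(a) - mean(b)) / spread, j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])

def centroids(X, y):
    """Per-class mean vector."""
    result = {}
    for lab in set(y):
        rows = [r for r, l in zip(X, y) if l == lab]
        result[lab] = [mean(col) for col in zip(*rows)]
    return result

def predict(cents, row):
    """Return (nearest class, margin), where margin is the gap in squared
    distance between the runner-up centroid and the nearest one."""
    d = sorted((sum((a - b) ** 2 for a, b in zip(row, c)), lab)
               for lab, c in cents.items())
    return d[0][1], d[1][0] - d[0][0]

def self_train(X_lab, y_lab, X_unlab, rounds=5, min_margin=1.0):
    """Stand-in for a semisupervised SVM: repeatedly pseudo-label the
    unlabeled samples the current model is confident about, then refit."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(rounds):
        cents = centroids(X_lab, y_lab)
        remaining = []
        for row in pool:
            lab, margin = predict(cents, row)
            if margin >= min_margin:
                X_lab.append(row)
                y_lab.append(lab)
            else:
                remaining.append(row)
        if len(remaining) == len(pool):  # nothing new was pseudo-labeled
            break
        pool = remaining
    return centroids(X_lab, y_lab)

# Hypothetical setup: 50 features, 3 of them informative, few labeled samples.
informative = [3, 17, 42]
X, y = make_toy_data(120, 50, informative, seed=0)
X_lab, y_lab = X[:30], y[:30]      # small labeled set
X_unlab = X[30:80]                 # labels withheld from training
X_test, y_test = X[80:], y[80:]

selected = rank_features(X_lab, y_lab, k=3)

def project(rows):
    """Restrict each sample to the selected features."""
    return [[r[j] for j in selected] for r in rows]

model = self_train(project(X_lab), y_lab, project(X_unlab))
hits = sum(predict(model, r)[0] == t for r, t in zip(project(X_test), y_test))
accuracy = hits / len(y_test)
print("selected features:", selected)
print("test accuracy:", accuracy)
```

Feature selection here uses only the labeled samples, while the unlabeled pool influences only the classifier; whether the paper arranges the two stages the same way is not stated in the abstract.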
