Identification of disease-causing genes using microarray data mining and Gene Ontology.

Affiliation

Intelligent Databases, Data mining and Bioinformatics Laboratory, Isfahan University of Technology, Isfahan, Iran.

Publication information

BMC Med Genomics. 2011 Jan 26;4:12. doi: 10.1186/1755-8794-4-12.

DOI: 10.1186/1755-8794-4-12
PMID: 21269461
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3037837/
Abstract

BACKGROUND

One of the best and most accurate methods for identifying disease-causing genes is monitoring gene expression values in different samples using microarray technology. One of the shortcomings of microarray data is that they provide a small quantity of samples with respect to the number of genes. This problem reduces the classification accuracy of the methods, so gene selection is essential to improve the predictive accuracy and to identify potential marker genes for a disease. Among numerous existing methods for gene selection, support vector machine-based recursive feature elimination (SVMRFE) has become one of the leading methods, but its performance can be reduced because of the small sample size, noisy data and the fact that the method does not remove redundant genes.
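As a rough illustration of the recursive elimination idea behind SVMRFE, the following minimal sketch trains a linear SVM, drops the gene with the smallest squared weight, and repeats. It is not the authors' implementation: the plain subgradient-descent trainer, the hyperparameters (`lam`, `lr`, `epochs`) and the toy expression values are all assumptions made for the example.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularised hinge loss.

    X: list of samples (each a list of floats); y: labels in {-1, +1}.
    Returns (weights, bias). Hyperparameters are illustrative assumptions.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin, so it contributes
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y, n_keep):
    """Repeatedly retrain and drop the gene with the smallest squared weight."""
    remaining = list(range(len(X[0])))  # column indices still in play
    eliminated = []                     # indices in order of removal
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated


# Toy data (made up): gene 1 is an exact copy of gene 0, genes 2 and 3 are noise.
gene0 = [2.0, 1.8, 2.2, -2.0, -1.9, -2.1]
noise_a = [0.1, -0.1, 0.0, 0.1, -0.1, 0.0]
noise_b = [0.0, 0.1, -0.1, 0.0, 0.1, -0.1]
X = [list(row) for row in zip(gene0, gene0, noise_a, noise_b)]
y = [1, 1, 1, -1, -1, -1]

selected, eliminated = svm_rfe(X, y, n_keep=2)
print("kept genes:", selected, "removed:", eliminated)
```

On this toy input the two noise genes are eliminated while gene 0 and its duplicate both survive, which mirrors the limitation noted above: ranking by weight alone does not remove redundant genes.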

METHODS

We propose a novel framework for gene selection which uses the advantageous features of conventional methods and addresses their weaknesses. In fact, we have combined the Fisher method and SVMRFE to utilize the advantages of a filtering method as well as an embedded method. Furthermore, we have added a redundancy reduction stage to address the weakness of the Fisher method and SVMRFE. In addition to gene expression values, the proposed method uses Gene Ontology which is a reliable source of information on genes. The use of Gene Ontology can compensate, in part, for the limitations of microarrays, such as having a small number of samples and erroneous measurement results.
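The filtering and redundancy reduction stages described above can be sketched roughly as follows. The abstract does not give the formulas, so everything specific here is an assumption: the two-class Fisher score is written in one common form, redundancy is measured as an equal-weight mix of absolute Pearson correlation and Jaccard overlap of GO term sets, and `max_sim`, `top_k`, the gene names and the GO identifiers are hypothetical placeholders.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """One common two-class form: (mu_pos - mu_neg)^2 / (var_pos + var_neg)."""
    pos = [v for v, l in zip(values, labels) if l == 1]
    neg = [v for v, l in zip(values, labels) if l != 1]
    denom = pvariance(pos) + pvariance(neg)
    return (mean(pos) - mean(neg)) ** 2 / denom if denom > 0 else 0.0


def pearson(a, b):
    """Pearson correlation of two equal-length expression profiles."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    norm = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return cov / norm if norm > 0 else 0.0


def jaccard(s, t):
    """Overlap of two GO term sets (0.0 when both are empty)."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def similarity(g, h, expr, go_terms):
    """Assumed redundancy measure: mean of |correlation| and GO overlap."""
    return 0.5 * abs(pearson(expr[g], expr[h])) + 0.5 * jaccard(go_terms[g], go_terms[h])


def select_genes(expr, go_terms, labels, top_k, max_sim=0.8):
    """Keep the top_k genes by Fisher score, then greedily drop redundant ones."""
    ranked = sorted(expr, key=lambda g: fisher_score(expr[g], labels), reverse=True)[:top_k]
    kept = []
    for g in ranked:
        if all(similarity(g, h, expr, go_terms) < max_sim for h in kept):
            kept.append(g)
    return kept


# Toy inputs (made up): B tracks A closely and shares its placeholder GO terms.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "A": [5.0, 5.2, 4.8, 1.0, 1.2, 0.8],
    "B": [4.0, 4.2, 3.8, 1.0, 1.2, 0.8],
    "C": [3.0, 2.0, 2.5, 0.5, 1.5, 1.0],
    "D": [2.0, 3.0, 1.0, 2.0, 3.0, 1.0],
}
go_terms = {
    "A": {"GO:x1", "GO:x2"},  # placeholder ids, not real annotations
    "B": {"GO:x1", "GO:x2"},
    "C": {"GO:x3"},
    "D": {"GO:x4"},
}

kept = select_genes(expr, go_terms, labels, top_k=3)
print(kept)
```

In this sketch gene D is removed by the Fisher filter and gene B by the redundancy check, leaving A and C. The surviving genes would then be passed to an embedded selector such as SVMRFE; that ordering of stages is likewise an assumption, since the abstract does not state it.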

RESULTS

The proposed method has been applied to colon, Diffuse Large B-Cell Lymphoma (DLBCL) and prostate cancer datasets. The empirical results show that our method has improved classification performance in terms of accuracy, sensitivity and specificity. In addition, the study of the molecular function of selected genes strengthened the hypothesis that these genes are involved in the process of cancer growth.
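For reference, the three reported measures can be computed from true and predicted labels as in the short sketch below. The function name and the toy labels are illustrative assumptions and do not reproduce any result from the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (true positive rate) and specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Made-up labels: 4 tumour samples (1) and 4 normal samples (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.625, sensitivity 0.75, specificity 0.5
```

Reporting sensitivity and specificity alongside accuracy matters for small, possibly imbalanced microarray datasets, where accuracy alone can hide poor performance on one class.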

CONCLUSIONS

The proposed method addresses the weakness of conventional methods by adding a redundancy reduction stage and utilizing Gene Ontology information. It predicts marker genes for colon, DLBCL and prostate cancer with a high accuracy. The predictions made in this study can serve as a list of candidates for subsequent wet-lab verification and might help in the search for a cure for cancers.
