Biomarker identification and cancer classification based on microarray data using Laplace naive Bayes model with mean shrinkage.

Affiliation

Center for Computer Vision and Department of Mathematics, Sun Yat-Sen University, Guangzhou 510275, China.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2012 Nov-Dec;9(6):1649-62. doi: 10.1109/TCBB.2012.105.

DOI:10.1109/TCBB.2012.105
PMID:22868679
Abstract

Biomarker identification and cancer classification are two closely related problems. In gene expression data sets, the correlation between genes can be high when they share the same biological pathway. Moreover, the gene expression data sets may contain outliers due to either chemical or electrical reasons. A good gene selection method should take group effects into account and be robust to outliers. In this paper, we propose a Laplace naive Bayes model with mean shrinkage (LNB-MS). The Laplace distribution instead of the normal distribution is used as the conditional distribution of the samples for the reasons that it is less sensitive to outliers and has been applied in many fields. The key technique is the L1 penalty imposed on the mean of each class to achieve automatic feature selection. The objective function of the proposed model is a piecewise linear function with respect to the mean of each class, of which the optimal value can be evaluated at the breakpoints simply. An efficient algorithm is designed to estimate the parameters in the model. A new strategy that uses the number of selected features to control the regularization parameter is introduced. Experimental results on simulated data sets and 17 publicly available cancer data sets attest to the accuracy, sparsity, efficiency, and robustness of the proposed algorithm. Many biomarkers identified with our method have been verified in biochemical or biomedical research. The analysis of biological and functional correlation of the genes based on Gene Ontology (GO) terms shows that the proposed method guarantees the selection of highly correlated genes simultaneously.
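
The breakpoint idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the exact objective, so the code assumes a single feature in a single class, a fixed Laplace scale, and shrinkage of the class mean toward zero (as if the feature had been centered beforehand). The names `shrunken_laplace_mean`, `lam`, and `scale` are hypothetical.

```python
import numpy as np

def shrunken_laplace_mean(x, lam, scale=1.0):
    """Minimize sum_i |x_i - mu| / scale + lam * |mu| over mu.

    Assumed objective (not taken from the paper): a Laplace negative
    log-likelihood in mu plus an L1 penalty on mu. It is convex and
    piecewise linear in mu, so a minimizer sits at a breakpoint:
    one of the observed values x_i, or 0.
    """
    x = np.asarray(x, dtype=float)
    candidates = np.append(x, 0.0)  # all breakpoints of the objective
    # Absolute deviations of every sample from every candidate,
    # summed per candidate (rows index candidates, columns index samples).
    fit = np.abs(x[None, :] - candidates[:, None]).sum(axis=1) / scale
    losses = fit + lam * np.abs(candidates)
    return float(candidates[np.argmin(losses)])

# One feature in one class, with a single gross outlier (8.0).
x = [1.0, 1.2, 0.9, 1.1, 8.0]
print(shrunken_laplace_mean(x, lam=0.0))   # 1.1 (the median; the outlier has little pull)
print(shrunken_laplace_mean(x, lam=2.0))   # 1.0 (pulled toward zero)
print(shrunken_laplace_mean(x, lam=10.0))  # 0.0 (feature effectively dropped)
```

Under these assumptions, a larger `lam` drives more class means exactly to zero, which is how an L1 penalty on the means can act as feature selection. The brute-force evaluation costs O(n²) per feature; a sorted scan over the breakpoints would be cheaper but is omitted to keep the sketch short.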

Similar articles

1
Biomarker identification and cancer classification based on microarray data using Laplace naive Bayes model with mean shrinkage.
IEEE/ACM Trans Comput Biol Bioinform. 2012 Nov-Dec;9(6):1649-62. doi: 10.1109/TCBB.2012.105.
2
Gene selection in cancer classification using sparse logistic regression with Bayesian regularization.
Bioinformatics. 2006 Oct 1;22(19):2348-55. doi: 10.1093/bioinformatics/btl386. Epub 2006 Jul 14.
3
A centroid-based gene selection method for microarray data classification.
J Theor Biol. 2016 Jul 7;400:32-41. doi: 10.1016/j.jtbi.2016.03.034. Epub 2016 Apr 4.
4
Gene selection for microarray gene expression classification using Bayesian Lasso quantile regression.
Comput Biol Med. 2018 Jun 1;97:145-152. doi: 10.1016/j.compbiomed.2018.04.018. Epub 2018 Apr 27.
5
Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data.
BMC Bioinformatics. 2007 Feb 28;8:67. doi: 10.1186/1471-2105-8-67.
6
Bayesian model averaging: development of an improved multi-class, gene selection and classification tool for microarray data.
Bioinformatics. 2005 May 15;21(10):2394-402. doi: 10.1093/bioinformatics/bti319. Epub 2005 Feb 15.
7
Cancer classification from gene expression data by NPPC ensemble.
IEEE/ACM Trans Comput Biol Bioinform. 2011 May-Jun;8(3):659-71. doi: 10.1109/TCBB.2010.36.
8
An efficient statistical feature selection approach for classification of gene expression data.
J Biomed Inform. 2011 Aug;44(4):529-35. doi: 10.1016/j.jbi.2011.01.001. Epub 2011 Jan 15.
9
Cancer classification and prediction using logistic regression with Bayesian gene selection.
J Biomed Inform. 2004 Aug;37(4):249-59. doi: 10.1016/j.jbi.2004.07.009.
10
A GMM-IG framework for selecting genes as expression panel biomarkers.
Artif Intell Med. 2010 Feb-Mar;48(2-3):75-82. doi: 10.1016/j.artmed.2009.07.006. Epub 2009 Dec 8.

Cited by

1
Folded concave penalized learning of high-dimensional MRI data in Parkinson's disease.
J Neurosci Methods. 2021 Jun 1;357:109157. doi: 10.1016/j.jneumeth.2021.109157. Epub 2021 Mar 26.
2
StressGenePred: a twin prediction model architecture for classifying the stress types of samples and discovering stress-related genes in arabidopsis.
BMC Genomics. 2019 Dec 20;20(Suppl 11):949. doi: 10.1186/s12864-019-6283-z.
3
An elastic-net logistic regression approach to generate classifiers and gene signatures for types of immune cells and T helper cell subsets.
BMC Bioinformatics. 2019 Aug 22;20(1):433. doi: 10.1186/s12859-019-2994-z.
4
Integration of 24 Feature Types to Accurately Detect and Predict Seizures Using Scalp EEG Signals.
Sensors (Basel). 2018 Apr 28;18(5):1372. doi: 10.3390/s18051372.
5
Identification of genes associated with renal cell carcinoma using gene expression profiling analysis.
Oncol Lett. 2016 Jul;12(1):73-78. doi: 10.3892/ol.2016.4573. Epub 2016 May 16.
6
Folded concave penalized learning in identifying multimodal MRI marker for Parkinson's disease.
J Neurosci Methods. 2016 Aug 1;268:1-6. doi: 10.1016/j.jneumeth.2016.04.016. Epub 2016 Apr 19.
7
Hierarchical gene selection and genetic fuzzy system for cancer microarray data classification.
PLoS One. 2015 Mar 30;10(3):e0120364. doi: 10.1371/journal.pone.0120364. eCollection 2015.
8
Cancer subtype discovery and biomarker identification via a new robust network clustering algorithm.
PLoS One. 2013 Jun 17;8(6):e66256. doi: 10.1371/journal.pone.0066256. Print 2013.