
Classification from microarray data using probabilistic discriminant partial least squares with reject option.

Author information

Department of Analytical Chemistry and Organic Chemistry, Rovira i Virgili University, 43007 Tarragona, Spain.

Publication information

Talanta. 2009 Nov 15;80(1):321-8. doi: 10.1016/j.talanta.2009.06.072. Epub 2009 Jul 7.

DOI: 10.1016/j.talanta.2009.06.072
PMID: 19782232
Abstract

Microarrays are used to simultaneously determine the expressions of thousands of genes. An important application of microarrays is in the classification of samples into classes of interest (e.g. either healthy cells or tumour cells). Discriminant partial least squares (DPLS) has often been used for this purpose. In this paper, we describe an improvement to DPLS that uses kernel-based probability density functions and the Bayes rule to classify samples whilst keeping the option of not classifying the sample if this cannot be done with sufficient confidence. With this approach, those samples outside the boundaries of the known classes or from the ambiguity region between classes are rejected and only samples with a high probability of being correctly classified are indeed classified. The optimal model is found by simultaneously minimizing the misclassification and rejection costs. The method (p-DPLS with reject option) was tested with two datasets. For the human cancers dataset the accuracy (obtained by leave-one-out cross-validation) was improved from 97% to 99% when compared to p-DPLS without reject option. For the breast cancer dataset, p-DPLS with reject option was able to reject 100% of the test samples that did not belong to any of the modelled classes. These samples would have been misclassified if the reject option had not been considered.
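
The classification rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each sample has already been reduced to a single predicted score (for example a DPLS predicted y-value), and the function names, bandwidth, thresholds, cost weights, and training scores are all hypothetical values chosen for illustration. Class-conditional densities are estimated with Gaussian kernels, the Bayes rule turns them into posteriors, and a sample is rejected either when its overall density is too low (outside the known classes) or when no class reaches the required posterior (ambiguity region).

```python
import math


def gaussian_kde(points, bandwidth):
    """Return a 1-D Gaussian kernel density estimate built from `points`."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)

    return density


def classify_with_reject(x, class_scores, bandwidth=0.15,
                         min_density=0.01, min_posterior=0.9):
    """Classify score `x`, or reject it.

    class_scores maps a class label to the training scores of that class.
    bandwidth, min_density and min_posterior are illustrative assumptions.
    """
    total = sum(len(scores) for scores in class_scores.values())
    # Prior (class frequency) times class-conditional kernel density.
    joint = {
        label: (len(scores) / total) * gaussian_kde(scores, bandwidth)(x)
        for label, scores in class_scores.items()
    }
    evidence = sum(joint.values())
    if evidence < min_density:
        return "reject:outlier"      # far from every modelled class
    label, best = max(joint.items(), key=lambda item: item[1])
    if best / evidence < min_posterior:
        return "reject:ambiguous"    # posterior too low to decide
    return label


def total_cost(predictions, truths, cost_error=1.0, cost_reject=0.3):
    """Sum misclassification and rejection costs (weights are assumptions)."""
    cost = 0.0
    for predicted, true in zip(predictions, truths):
        if predicted.startswith("reject"):
            cost += cost_reject
        elif predicted != true:
            cost += cost_error
    return cost


# Hypothetical training scores for two classes.
training_scores = {
    "healthy": [-0.10, -0.05, 0.00, 0.05, 0.10],
    "tumour": [0.90, 0.95, 1.00, 1.05, 1.10],
}
for score in (0.02, 0.50, 3.00):
    print(score, classify_with_reject(score, training_scores))
```

With these illustrative numbers, a score near one class centre is classified, a score midway between the classes is rejected as ambiguous, and a score far from both is rejected as an outlier. In a model-selection loop, `total_cost` could be evaluated under cross-validation for each candidate setting and the cheapest setting kept, which mirrors the idea of minimizing misclassification and rejection costs together.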


Similar articles

1. Classification from microarray data using probabilistic discriminant partial least squares with reject option.
   Talanta. 2009 Nov 15;80(1):321-8. doi: 10.1016/j.talanta.2009.06.072. Epub 2009 Jul 7.
2. Optimal approach for classification of acute leukemia subtypes based on gene expression data.
   Biotechnol Prog. 2002 Jul-Aug;18(4):847-54. doi: 10.1021/bp025517o.
3. Independent component analysis-based penalized discriminant method for tumor classification using gene expression data.
   Bioinformatics. 2006 Aug 1;22(15):1855-62. doi: 10.1093/bioinformatics/btl190. Epub 2006 May 18.
4. Challenges in projecting clustering results across gene expression-profiling datasets.
   J Natl Cancer Inst. 2007 Nov 21;99(22):1715-23. doi: 10.1093/jnci/djm216. Epub 2007 Nov 13.
5. Variable selection using probability density function similarity for support vector machine classification of high-dimensional microarray data.
   Talanta. 2009 Jul 15;79(2):260-7. doi: 10.1016/j.talanta.2009.03.044. Epub 2009 Mar 31.
6. Gene selection and classification from microarray data using kernel machine.
   FEBS Lett. 2004 Jul 30;571(1-3):93-8. doi: 10.1016/j.febslet.2004.05.087.
7. Reliable gene signatures for microarray classification: assessment of stability and performance.
   Bioinformatics. 2006 Oct 1;22(19):2356-63. doi: 10.1093/bioinformatics/btl400. Epub 2006 Jul 31.
8. Multi-class classification with probabilistic discriminant partial least squares (p-DPLS).
   Anal Chim Acta. 2010 Apr 1;664(1):27-33. doi: 10.1016/j.aca.2010.01.059. Epub 2010 Feb 6.
9. Simple discriminant functions identify small sets of genes that distinguish cancer phenotype from normal.
   Genome Inform. 2005;16(1):245-53.
10. Classification with reject option in gene expression data.
    Bioinformatics. 2008 Sep 1;24(17):1889-95. doi: 10.1093/bioinformatics/btn349. Epub 2008 Jul 10.

Cited by

1. Global Lipidome Profiling Revealed Multifaceted Role of Lipid Species in Hepatitis C Virus Replication, Assembly, and Host Antiviral Response.
   Viruses. 2023 Feb 7;15(2):464. doi: 10.3390/v15020464.
2. So you think you can PLS-DA?
   BMC Bioinformatics. 2020 Dec 9;21(Suppl 1):2. doi: 10.1186/s12859-019-3310-7.