Bayesian variable selection logistic regression with paired proteomic measurements.

Authors

Kakourou Alexia, Mertens Bart

Affiliation

Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, 2300, RC, Leiden, The Netherlands.

Publication details

Biom J. 2018 Sep;60(5):1003-1020. doi: 10.1002/bimj.201700182. Epub 2018 Jun 25.

DOI:10.1002/bimj.201700182
PMID:29943441
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6175404/
Abstract

We explore the problem of variable selection in a case-control setting with mass spectrometry proteomic data consisting of paired measurements. Each pair corresponds to a distinct isotope cluster, and each component within a pair represents a summary of isotopic expression based on either the intensity or the shape of the cluster. Our objective is to identify a collection of isotope clusters associated with the disease outcome and, at the same time, to assess the added predictive value of shape beyond intensity while maintaining predictive performance. We propose a Bayesian model that exploits the paired structure of our data and utilizes prior information on the relative predictive power of each source by introducing multiple layers of selection. This allows us to make simultaneous inference on which pairs are the most informative and for which pairs, and to what extent, shape has complementary value in separating the two groups. We evaluate the Bayesian model on pancreatic cancer data. Results from the fitted model show that most of the predictive potential is achieved with a subset of just six (out of 1289) pairs, while the contribution of the intensity components is much higher than that of the shape components. To demonstrate how the method behaves in a controlled setting, we consider a simulation study. Results from this study indicate that the proposed approach can successfully select the truly predictive pairs and accurately estimate the effects of both components, although in some cases the model tends to overestimate the inclusion probability of the second component.
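To make the idea of "multiple layers of selection" more concrete, the following is a minimal sketch of one way a nested selection prior over pairs could look. It is not the authors' model: it assumes a two-layer spike-and-slab structure in which a pair-level indicator switches on the intensity coefficient and a second indicator, only available for included pairs, switches on the shape coefficient. The function names, the prior inclusion probabilities (`p_pair`, `p_shape`), and the normal slab are all hypothetical choices for illustration. The sketch only draws from the prior and evaluates the logistic link; posterior inference (for example by MCMC) is not shown.

```python
import math
import random


def sample_nested_prior(n_pairs, p_pair=0.05, p_shape=0.3, slab_sd=1.0, seed=0):
    """Draw one set of coefficients from an illustrative two-layer prior.

    Layer 1 decides whether a pair enters the model (intensity effect).
    Layer 2 decides, only for included pairs, whether shape adds an effect.
    All hyperparameters are hypothetical and not taken from the paper.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_pairs):
        pair_in = rng.random() < p_pair
        shape_in = pair_in and rng.random() < p_shape
        b_intensity = rng.gauss(0.0, slab_sd) if pair_in else 0.0
        b_shape = rng.gauss(0.0, slab_sd) if shape_in else 0.0
        draws.append((pair_in, shape_in, b_intensity, b_shape))
    return draws


def case_probability(intensity, shape, draws, intercept=0.0):
    """Logistic-regression probability of being a case for one sample."""
    eta = intercept
    for x_int, x_shape, (_, _, b_int, b_shape) in zip(intensity, shape, draws):
        eta += x_int * b_int + x_shape * b_shape
    return 1.0 / (1.0 + math.exp(-eta))


# Simulated example with the same number of pairs as in the abstract (1289).
n_pairs = 1289
draws = sample_nested_prior(n_pairs, seed=1)
data_rng = random.Random(2)
intensity = [data_rng.gauss(0.0, 1.0) for _ in range(n_pairs)]
shape = [data_rng.gauss(0.0, 1.0) for _ in range(n_pairs)]
prob = case_probability(intensity, shape, draws)

n_pairs_in = sum(d[0] for d in draws)
n_shape_in = sum(d[1] for d in draws)
print(f"pairs included: {n_pairs_in}, with shape effect: {n_shape_in}")
print(f"case probability for the simulated sample: {prob:.3f}")
```

Because the shape indicator is nested inside the pair indicator, a shape effect can never be selected without its intensity partner. Under this assumed structure, the shape component is treated as a complement to intensity rather than as a competing predictor.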

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f20b/6175404/8ceaa5c2ae82/BIMJ-60-1003-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f20b/6175404/1fbfee306f2e/BIMJ-60-1003-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f20b/6175404/a71ad58348ac/BIMJ-60-1003-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f20b/6175404/e39d106a6667/BIMJ-60-1003-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f20b/6175404/aa1ce2dd31aa/BIMJ-60-1003-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f20b/6175404/1a1ba80ef0f0/BIMJ-60-1003-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f20b/6175404/3ead831f62d5/BIMJ-60-1003-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f20b/6175404/264374ab5b18/BIMJ-60-1003-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f20b/6175404/3f0e6b9b3be5/BIMJ-60-1003-g009.jpg
