Bayesian negative binomial mixture regression models for the analysis of sequence count and methylation data.

Author information

Li Qiwei, Cassese Alberto, Guindani Michele, Vannucci Marina

Affiliations

Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A.

Department of Methodology and Statistics, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands.

Publication information

Biometrics. 2019 Mar;75(1):183-192. doi: 10.1111/biom.12962. Epub 2018 Sep 19.

DOI: 10.1111/biom.12962
PMID: 30125947

Abstract

In this article, we develop a Bayesian hierarchical mixture regression model for studying the association between a multivariate response, measured as counts on a set of features, and a set of covariates. We have available RNA-Seq and DNA methylation data measured on breast cancer patients at different stages of the disease. We account for the heterogeneity and over-dispersion of count data (here, RNA-Seq data) by considering a mixture of negative binomial distributions and incorporate the covariates (here, methylation data) into the model via a linear modeling construction on the mean components. Our modeling construction includes several innovative characteristics. First, it employs selection techniques that allow the identification of a small subset of features that best discriminate the samples while simultaneously selecting a set of covariates associated to each feature. Second, it incorporates known dependencies into the feature selection process via the use of Markov random field (MRF) priors. On simulated data, we show how incorporating existing information via the prior model can improve the accuracy of feature selection. In the analysis of RNA-Seq and DNA methylation data on breast cancer, we incorporate knowledge on relationships among genes via a gene-gene network, which we extract from the KEGG database. Our data analysis identifies genes which are discriminatory of cancer stages and simultaneously selects significant associations between those genes and DNA methylation sites. A biological interpretation of our findings reveals several biomarkers that can help understanding the effect of DNA methylation on gene expression transcription across cancer stages.
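
The abstract describes three ingredients: negative binomial likelihoods for the counts, a linear model on the mean that brings in the covariates, and a Markov random field prior that lets neighbouring features in a network support each other's selection. The sketch below illustrates each ingredient in isolation. It is not the authors' implementation. The mean/dispersion parameterization, the log link with a per-sample size factor, the Ising-type conditional prior, and all function and parameter names (`nb_logpmf`, `log_mean`, `mrf_inclusion_prob`, `d`, `f`) are assumptions made for illustration.

```python
import math

def nb_logpmf(y, mu, phi):
    """Log-pmf of a negative binomial with mean mu and dispersion phi.

    Assumed parameterization: variance = mu + mu**2 / phi, so smaller phi
    means stronger over-dispersion relative to a Poisson.
    """
    return (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * math.log(phi / (mu + phi))
        + y * math.log(mu / (mu + phi))
    )

def log_mean(size_factor, baseline, group_effect, x, beta):
    """Log-linear mean for one sample and one feature (hypothetical form).

    size_factor  : per-sample scaling, e.g. sequencing depth
    baseline     : feature-specific intercept
    group_effect : mixture-component shift; zero for non-discriminatory features
    x, beta      : covariates (e.g. methylation) and their coefficients;
                   a coefficient of 0.0 stands for an unselected covariate
    """
    linear = sum(x_r * b_r for x_r, b_r in zip(x, beta))
    return math.log(size_factor) + baseline + group_effect + linear

def mrf_inclusion_prob(j, gamma, neighbors, d, f):
    """Conditional prior probability that feature j is selected.

    One common Ising-type form: logit P(gamma_j = 1 | rest) = d + f * (number
    of selected neighbours of j). d controls sparsity, f the network smoothing.
    """
    active = sum(gamma[n] for n in neighbors[j])
    return 1.0 / (1.0 + math.exp(-(d + f * active)))

# Toy usage with made-up numbers.
mu = math.exp(log_mean(1.2, 2.0, 0.5, [0.3, 0.8], [-1.0, 0.0]))
loglik = nb_logpmf(7, mu, phi=2.0)

gamma = [1, 1, 0]                          # current selection indicators
neighbors = {0: [1], 1: [0, 2], 2: [1]}    # toy gene-gene network
p_sel = mrf_inclusion_prob(2, gamma, neighbors, d=-2.0, f=1.0)
```

In a full sampler these pieces would be combined inside an MCMC scheme that updates the component allocations, the feature indicators, and the covariate indicators. That scheme is omitted here because the abstract does not specify it.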


Similar articles

1. Bayesian negative binomial mixture regression models for the analysis of sequence count and methylation data.
   Biometrics. 2019 Mar;75(1):183-192. doi: 10.1111/biom.12962. Epub 2018 Sep 19.
2. A sparse negative binomial mixture model for clustering RNA-seq count data.
   Biostatistics. 2022 Dec 12;24(1):68-84. doi: 10.1093/biostatistics/kxab025.
3. A variational Bayes beta mixture model for feature selection in DNA methylation studies.
   J Bioinform Comput Biol. 2013 Aug;11(4):1350005. doi: 10.1142/S0219720013500054. Epub 2013 Mar 18.
4. miRNA-target gene regulatory networks: A Bayesian integrative approach to biomarker selection with application to kidney cancer.
   Biometrics. 2015 Jun;71(2):428-38. doi: 10.1111/biom.12266. Epub 2015 Jan 30.
5. Variable selection for discriminant analysis with Markov random field priors for the analysis of microarray data.
   Bioinformatics. 2011 Feb 15;27(4):495-501. doi: 10.1093/bioinformatics/btq690. Epub 2010 Dec 14.
6. QNB: differential RNA methylation analysis for count-based small-sample sequencing data with a quad-negative binomial model.
   BMC Bioinformatics. 2017 Aug 31;18(1):387. doi: 10.1186/s12859-017-1808-4.
7. DM-BLD: differential methylation detection using a hierarchical Bayesian model exploiting local dependency.
   Bioinformatics. 2017 Jan 15;33(2):161-168. doi: 10.1093/bioinformatics/btw596. Epub 2016 Sep 11.
8. Part 1. Statistical Learning Methods for the Effects of Multiple Air Pollution Constituents.
   Res Rep Health Eff Inst. 2015 Jun(183 Pt 1-2):5-50.
9. Bivariate zero-inflated regression for count data: a Bayesian approach with application to plant counts.
   Int J Biostat. 2010;6(1):Article 27. doi: 10.2202/1557-4679.1229.
10. Multivariate Bayesian variable selection exploiting dependence structure among outcomes: Application to air pollution effects on DNA methylation.
    Biometrics. 2017 Mar;73(1):232-241. doi: 10.1111/biom.12557. Epub 2016 Jul 5.

Cited by

1. Poisson Beta Regression for Count Data With an Application to Hospital Length of Stay Data.
   Stat Med. 2025 Aug;44(18-19):e70217. doi: 10.1002/sim.70217.
2. A review of model evaluation metrics for machine learning in genetics and genomics.
   Front Bioinform. 2024 Sep 10;4:1457619. doi: 10.3389/fbinf.2024.1457619. eCollection 2024.
3. NetMIM: network-based multi-omics integration with block missingness for biomarker selection and disease outcome prediction.
   Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae454.
4. An interpretable Bayesian clustering approach with feature selection for analyzing spatially resolved transcriptomics data.
   Biometrics. 2024 Jul 1;80(3). doi: 10.1093/biomtc/ujae066.
5. Reconstructing Spatial Transcriptomics at the Single-cell Resolution with BayesDeep.
   bioRxiv. 2023 Dec 8:2023.12.07.570715. doi: 10.1101/2023.12.07.570715.
6. Correlates of injection-related wounds and skin infections amongst persons who inject drugs and use a syringe service programme: A single center study.
   Int Wound J. 2021 Oct;18(5):701-707. doi: 10.1111/iwj.13572. Epub 2021 Feb 15.
7. HARMONIES: A Hybrid Approach for Microbiome Networks Inference via Exploiting Sparsity.
   Front Genet. 2020 Jun 3;11:445. doi: 10.3389/fgene.2020.00445. eCollection 2020.