
β-empirical Bayes inference and model diagnosis of microarray data.

Affiliation

Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan.

Publication information

BMC Bioinformatics. 2012 Jun 19;13:135. doi: 10.1186/1471-2105-13-135.

DOI: 10.1186/1471-2105-13-135
PMID: 22713095
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3464654/

Abstract

BACKGROUND

Microarray data enable the high-throughput survey of mRNA expression profiles at the genomic level; however, the data present a challenging statistical problem because a large number of transcripts is measured on a small number of samples. To reduce the dimensionality, various Bayesian or empirical Bayes hierarchical models have been developed. However, because of the complexity of microarray data, no model can explain the data fully. It is generally difficult to scrutinize irregular expression patterns that the usual gene-by-gene statistical models do not anticipate.

RESULTS

As an extension of empirical Bayes (EB) procedures, we have developed the β-empirical Bayes (β-EB) approach based on a β-likelihood measure, which can be regarded as an 'evidence-based' weighted (quasi-) likelihood inference. The weight of a transcript t is described as a power function of its likelihood, f^β(y_t|θ). Genes with low likelihoods have unexpected expression patterns and therefore receive low weights. By assigning low weights to outliers, the inference becomes robust. The value of β, which controls the balance between robustness and efficiency, is selected by maximizing the predictive β₀-likelihood by cross-validation. The proposed β-EB approach identified six significant (p<10⁻⁵) contaminated transcripts as differentially expressed (DE) in normal/tumor tissues from the head and neck of cancer patients. These six genes were all confirmed to be related to cancer; they were not identified as DE genes by the classical EB approach. When applied to the eQTL analysis of Arabidopsis thaliana, the proposed β-EB approach identified some potential master regulators that were missed by the EB approach.
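The weighting idea described above can be illustrated with a minimal sketch. This is not the authors' β-EB hierarchical model: it assumes a single normal location model with a known scale `sigma`, and the function names (`beta_weights`, `beta_weighted_mean`) and the toy data are hypothetical. Each observation receives a weight equal to its model density raised to the power β, and the location is re-estimated as a weighted mean until it stabilizes. With β = 0 every weight equals 1 and the estimate reduces to the ordinary mean.

```python
import math


def normal_pdf(y, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def beta_weights(ys, mu, sigma, beta):
    """Weight of each observation: its model density raised to the power beta."""
    return [normal_pdf(y, mu, sigma) ** beta for y in ys]


def beta_weighted_mean(ys, sigma, beta, n_iter=100, tol=1e-10):
    """Fixed-point iteration for a location estimate under power-of-density weights.

    Assumption for illustration only: sigma is treated as known.
    """
    mu = sorted(ys)[len(ys) // 2]  # start from the (upper) median
    for _ in range(n_iter):
        w = beta_weights(ys, mu, sigma, beta)
        new_mu = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu


# Toy data (hypothetical): six values near zero and one gross outlier.
ys = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, 8.0]
classical = beta_weighted_mean(ys, sigma=1.0, beta=0.0)  # ordinary mean, pulled by the outlier
robust = beta_weighted_mean(ys, sigma=1.0, beta=0.5)     # outlier is strongly down-weighted
```

Larger values of β discount low-density observations more aggressively at the cost of efficiency when the data are clean, which is the trade-off the abstract says is tuned by cross-validation.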

CONCLUSIONS

The simulation data and real gene expression data showed that the proposed β-EB method was robust against outliers. The distribution of the weights was used to scrutinize the irregular patterns of expression and diagnose the model statistically. When β-weights outside the range of the predicted distribution were observed, a detailed inspection of the data was carried out. The β-weights described here can be applied to other likelihood-based statistical models for diagnosis, and may serve as a useful tool for transcriptome and proteome studies.
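The diagnostic use of the weights can be sketched in the same simplified setting. The following is an illustration under stated assumptions, not the procedure from the paper: it assumes a normal model with known `mu` and `sigma`, approximates the predicted distribution of the weights by Monte Carlo simulation from that model, and flags observations whose weight falls below a low quantile of the simulated weights. The names `predicted_weight_floor` and `flag_low_weights`, the quantile level `alpha`, and the toy data are all hypothetical.

```python
import math
import random


def beta_weight(y, mu, sigma, beta):
    """Model density of y under N(mu, sigma^2), raised to the power beta."""
    z = (y - mu) / sigma
    density = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return density ** beta


def predicted_weight_floor(mu, sigma, beta, alpha=0.001, n_sim=20000, seed=0):
    """Approximate the alpha-quantile of the weights expected if the model holds."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sims = sorted(
        beta_weight(rng.gauss(mu, sigma), mu, sigma, beta) for _ in range(n_sim)
    )
    return sims[int(alpha * n_sim)]


def flag_low_weights(ys, mu, sigma, beta, alpha=0.001):
    """Indices of observations whose weight is below the predicted range."""
    floor = predicted_weight_floor(mu, sigma, beta, alpha)
    return [i for i, y in enumerate(ys) if beta_weight(y, mu, sigma, beta) < floor]


# Toy data (hypothetical): the last value lies far outside the fitted model.
ys = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, 8.0]
suspicious = flag_low_weights(ys, mu=0.0, sigma=1.0, beta=0.5)
```

Flagged indices are candidates for closer inspection of the raw data rather than automatic exclusions.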




