Bayesian generalized biclustering analysis via adaptive structured shrinkage.

Author information

Department of Biostatistics and Bioinformatics, Emory University, 1518 Clifton Road, NE, Atlanta, GA, USA.

Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Drive, Philadelphia, PA, USA.

Publication information

Biostatistics. 2020 Jul 1;21(3):610-624. doi: 10.1093/biostatistics/kxy081.

Abstract

Biclustering techniques can identify local patterns of a data matrix by clustering feature space and sample space at the same time. Various biclustering methods have been proposed and successfully applied to the analysis of gene expression data. While existing biclustering methods have many desirable features, most of them are developed for continuous data and few of them can efficiently handle -omics data of various types, for example, binomial data as in single nucleotide polymorphism data or negative binomial data as in RNA-seq data. In addition, none of the existing methods can utilize biological information such as that from functional genomics or proteomics. Recent work has shown that incorporating biological information can improve variable selection and prediction performance in analyses such as linear regression and multivariate analysis. In this article, we propose a novel Bayesian biclustering method that can handle multiple data types including Gaussian, Binomial, and Negative Binomial. In addition, our method uses a Bayesian adaptive structured shrinkage prior that enables feature selection guided by existing biological information. Our simulation studies and application to multi-omics datasets demonstrate robust and superior performance of the proposed method, compared to other existing biclustering methods.
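
To make the idea of clustering features and samples at the same time concrete, here is a minimal sketch of one bicluster recovered by a sparse rank-1 factorization with soft-thresholding. This is not the Bayesian method of the article: it assumes Gaussian-like continuous data, uses no biological prior information, and replaces the adaptive structured shrinkage prior with a single fixed threshold. The names `soft_threshold`, `rank1_bicluster`, and `lam`, as well as the toy matrix, are illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink every entry of z toward zero by lam; small entries become exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def rank1_bicluster(X, lam=1.0, n_iter=20):
    """Alternate sparse updates of a row vector u and a column vector v.

    Rows with non-zero u and columns with non-zero v form one bicluster.
    Hypothetical helper for illustration only (fixed threshold, no prior).
    """
    n_rows, n_cols = X.shape
    v = np.ones(n_cols) / np.sqrt(n_cols)   # dense, unit-norm starting point
    u = np.zeros(n_rows)
    for _ in range(n_iter):
        u = soft_threshold(X @ v, lam)      # select rows given current columns
        if not u.any():
            break
        u /= np.linalg.norm(u)
        v = soft_threshold(X.T @ u, lam)    # select columns given current rows
        if not v.any():
            break
        v /= np.linalg.norm(v)
    return np.flatnonzero(u), np.flatnonzero(v)

# Toy data: weak background signal with one planted block (rows 0-2, columns 0-1).
X = np.full((6, 5), 0.1)
X[0:3, 0:2] = 5.0

rows, cols = rank1_bicluster(X, lam=1.0)
print("bicluster rows:", rows.tolist())
print("bicluster columns:", cols.tolist())
```

In this sketch the threshold plays the role that shrinkage plays in the article: entries of `u` and `v` that are shrunk to zero drop out of the bicluster. A knowledge-guided variant would, under the abstract's description, let the amount of shrinkage vary by feature according to external biological information rather than use one global `lam`.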

Similar articles

Knowledge-Guided Biclustering via Sparse Variational EM Algorithm.
10th IEEE Int Conf Big Knowl (2019). 2019 Nov;2019:25-32. doi: 10.1109/icbk.2019.00012. Epub 2019 Dec 30.

Bayesian biclustering of gene expression data.
BMC Genomics. 2008;9 Suppl 1(Suppl 1):S4. doi: 10.1186/1471-2164-9-S1-S4.

Cited by

A clustering approach to integrative analyses of multiomic cancer data.
J Appl Stat. 2024 Nov 29;52(8):1539-1560. doi: 10.1080/02664763.2024.2431742. eCollection 2025.

Knowledge-guided learning methods for integrative analysis of multi-omics data.
Comput Struct Biotechnol J. 2024 Apr 30;23:1945-1950. doi: 10.1016/j.csbj.2024.04.053. eCollection 2024 Dec.

Robust integrative biclustering for multi-view data.
Stat Methods Med Res. 2022 Nov;31(11):2201-2216. doi: 10.1177/09622802221122427. Epub 2022 Sep 13.

