Clustering of temporal gene expression data with mixtures of mixed effects models with a penalized likelihood.

Affiliations

Biostatistics, Boston University, Boston, MA, USA.

Orthopaedic Surgery, Boston University, Boston, MA, USA.

Publication information

Bioinformatics. 2019 Mar 1;35(5):778-786. doi: 10.1093/bioinformatics/bty696.

Abstract

MOTIVATION

Clustering algorithms such as K-Means and standard Gaussian mixture models (GMM) fail to account for the structure of variability in replicated data or repeated measures over time. In addition, having to assume the number of clusters a priori complicates the process. Current methods for optimizing cluster labels and cluster number can be inaccurate or computationally intensive for temporal gene expression data with this additional variability.

RESULTS

An extension to a model-based clustering algorithm is proposed using mixtures of mixed effects polynomial regression models and the EM algorithm with an entropy penalized log-likelihood function (EPEM). The EPEM is used to cluster temporal gene expression data with this additional variability. The addition of random effects in our model decreased the misclassification error when compared to mixtures of fixed effects models or other methods such as K-Means and GMM. Applying our method to microarray data from a fracture healing study revealed distinct temporal patterns of gene expression.
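
The abstract does not spell out the form of the entropy penalty. As a rough illustration only, the sketch below assumes one commonly used form, in which the term β·n·Σ_k α_k ln α_k is added to the log-likelihood over the mixing proportions α_k, so that weakly supported clusters shrink and can be dropped during EM instead of fixing the cluster number in advance. The function name `penalized_mixing_update`, the weight `beta`, and the 1/n pruning threshold are assumptions made for this example and are not taken from the EPEM-GMM repository; the mixed effects regression step is omitted.

```python
import math

def penalized_mixing_update(resp, alpha_old, beta):
    """One entropy-penalized update of the mixing proportions (illustrative).

    resp      : list of n rows, each a list of K responsibilities summing to 1
    alpha_old : list of K current mixing proportions, all > 0
    beta      : penalty weight (assumed name); beta = 0 gives the ordinary EM update
    Returns (renormalized proportions of the surviving clusters, keep flags).
    """
    n = len(resp)
    k = len(alpha_old)
    # Ordinary EM update: average responsibility per cluster.
    alpha_em = [sum(row[j] for row in resp) / n for j in range(k)]
    log_a = [math.log(a) for a in alpha_old]
    # sum_s alpha_s * ln(alpha_s), i.e. the negative entropy (always <= 0).
    neg_entropy = sum(a * la for a, la in zip(alpha_old, log_a))
    # Penalty shifts mass away from clusters whose ln(alpha) is below average.
    alpha_new = [
        alpha_em[j] + beta * alpha_old[j] * (log_a[j] - neg_entropy)
        for j in range(k)
    ]
    # Assumed pruning rule: drop clusters supported by less than one observation.
    keep = [a >= 1.0 / n for a in alpha_new]
    kept = [a for a, flag in zip(alpha_new, keep) if flag]
    total = sum(kept)
    return [a / total for a in kept], keep


# Example: three clusters, the third weakly supported.
alpha = [0.6, 0.35, 0.05]
responsibilities = [alpha[:] for _ in range(100)]
new_alpha, keep = penalized_mixing_update(responsibilities, alpha, beta=1.0)
print(keep)
```

With `beta=0.0` the same call reduces to the unpenalized EM update and keeps all three clusters, which makes the effect of the penalty easy to compare.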

AVAILABILITY AND IMPLEMENTATION

https://github.com/darlenelu72/EPEM-GMM.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

