Supervised cluster analysis for microarray data based on multivariate Gaussian mixture.

Authors

Qu Yi, Xu Shizhong

Affiliation

Department of Botany and Plant Sciences, University of California, Riverside, CA 92521, USA.

Publication

Bioinformatics. 2004 Aug 12;20(12):1905-13. doi: 10.1093/bioinformatics/bth177. Epub 2004 Mar 25.

DOI: 10.1093/bioinformatics/bth177
PMID: 15044244

Abstract

MOTIVATION

Grouping genes that have similar expression patterns is called gene clustering, which has proven to be a useful tool for extracting the underlying biological information in gene expression data. Many clustering procedures have been successful in microarray gene clustering; most of them belong to the family of heuristic clustering algorithms. Model-based algorithms are an alternative. They assume that the whole set of microarray data is a finite mixture of distributions of a certain type with different parameters. The application of model-based algorithms to unsupervised clustering has been reported. Here, for the first time, we demonstrate the use of a model-based algorithm in supervised clustering of microarray data.

RESULTS

We applied the proposed methods to real gene expression data and to simulated data. We showed that the supervised model-based algorithm is superior to both the unsupervised method and the support vector machine (SVM) method.

AVAILABILITY

The program, written in the SAS language and implementing methods I-III in this report, is available upon request. The SVM software is available at the website http://svm.sdsc.edu/cgi-bin/nph-SVMsubmit.cgi
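
The finite-mixture assumption described under MOTIVATION can be illustrated with a minimal sketch. It is not the authors' methods I-III, which the abstract does not describe. It only shows the generic idea of a supervised multivariate Gaussian mixture: each labeled class contributes one Gaussian component whose mixing proportion, mean vector, and covariance matrix are estimated from the labeled genes, and a new expression profile is assigned to the class with the highest posterior probability. All function names, the small `ridge` term added to each covariance matrix, and the simulated two-dimensional data are assumptions made for illustration.

```python
import numpy as np


def fit_supervised_mixture(X, y, ridge=1e-6):
    """Estimate (mixing proportion, mean, covariance) for each labeled class.

    The ridge term is an assumption added here to keep each covariance
    matrix invertible; it is not taken from the paper.
    """
    n, d = X.shape
    params = {}
    for k in np.unique(y):
        Xk = X[y == k]
        params[k] = (
            len(Xk) / n,                                    # mixing proportion
            Xk.mean(axis=0),                                # mean vector
            np.cov(Xk, rowvar=False) + ridge * np.eye(d),   # covariance matrix
        )
    return params


def log_gaussian(X, mean, cov):
    """Log density of a multivariate Gaussian, evaluated row by row."""
    d = X.shape[1]
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)


def posterior(X, params):
    """Posterior class probabilities under the fitted mixture."""
    classes = sorted(params)
    log_joint = np.column_stack([
        np.log(params[k][0]) + log_gaussian(X, params[k][1], params[k][2])
        for k in classes
    ])
    log_joint -= log_joint.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_joint)
    post /= post.sum(axis=1, keepdims=True)
    return np.array(classes), post


def predict(X, params):
    """Assign each row to the class with the highest posterior probability."""
    classes, post = posterior(X, params)
    return classes[post.argmax(axis=1)]


# Simulated stand-in for labeled expression profiles (two conditions per gene).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2)),
    rng.normal(loc=[6.0, 6.0], scale=1.0, size=(50, 2)),
])
y = np.array([0] * 50 + [1] * 50)

params = fit_supervised_mixture(X, y)
print(predict(np.array([[0.2, -0.1], [5.8, 6.3]]), params))
```

In the unsupervised setting the class labels are unknown, so the same parameters would have to be estimated iteratively (for example with the EM algorithm) rather than computed directly from labeled groups.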

Similar articles

1
Supervised cluster analysis for microarray data based on multivariate Gaussian mixture.
Bioinformatics. 2004 Aug 12;20(12):1905-13. doi: 10.1093/bioinformatics/bth177. Epub 2004 Mar 25.
2
Bayesian mixture model based clustering of replicated microarray data.
Bioinformatics. 2004 May 22;20(8):1222-32. doi: 10.1093/bioinformatics/bth068. Epub 2004 Feb 10.
3
Comparisons and validation of statistical clustering techniques for microarray gene expression data.
Bioinformatics. 2003 Mar 1;19(4):459-66. doi: 10.1093/bioinformatics/btg025.
4
Interactively optimizing signal-to-noise ratios in expression profiling: project-specific algorithm selection and detection p-value weighting in Affymetrix microarrays.
Bioinformatics. 2004 Nov 1;20(16):2534-44. doi: 10.1093/bioinformatics/bth280. Epub 2004 Apr 29.
5
Clustering of change patterns using Fourier coefficients.
Bioinformatics. 2008 Jan 15;24(2):184-91. doi: 10.1093/bioinformatics/btm568. Epub 2007 Nov 19.
6
Modeling and visualizing uncertainty in gene expression clusters using dirichlet process mixtures.
IEEE/ACM Trans Comput Biol Bioinform. 2009 Oct-Dec;6(4):615-28. doi: 10.1109/TCBB.2007.70269.
7
A new clustering method for microarray data analysis.
Proc IEEE Comput Soc Bioinform Conf. 2002;1:268-75.
8
CLICK and EXPANDER: a system for clustering and visualizing gene expression data.
Bioinformatics. 2003 Sep 22;19(14):1787-99. doi: 10.1093/bioinformatics/btg232.
9
Kernel hierarchical gene clustering from microarray expression data.
Bioinformatics. 2003 Nov 1;19(16):2097-104. doi: 10.1093/bioinformatics/btg288.
10
Bayesian infinite mixture model based clustering of gene expression profiles.
Bioinformatics. 2002 Sep;18(9):1194-206. doi: 10.1093/bioinformatics/18.9.1194.

Cited by

1
Assessment of data transformations for model-based clustering of RNA-Seq data.
PLoS One. 2018 Feb 27;13(2):e0191758. doi: 10.1371/journal.pone.0191758. eCollection 2018.
2
An improved method for functional similarity analysis of genes based on Gene Ontology.
BMC Syst Biol. 2016 Dec 23;10(Suppl 4):119. doi: 10.1186/s12918-016-0359-z.
3
SGFSC: speeding the gene functional similarity calculation based on hash tables.
BMC Bioinformatics. 2016 Nov 4;17(1):445. doi: 10.1186/s12859-016-1294-0.
4
A Method for the Annotation of Functional Similarities of Coding DNA Sequences: the Case of a Populated Cluster of Transmembrane Proteins.
J Mol Evol. 2017 Jan;84(1):29-38. doi: 10.1007/s00239-016-9763-7. Epub 2016 Nov 3.
5
Performance Evaluation of Missing-Value Imputation Clustering Based on a Multivariate Gaussian Mixture Model.
PLoS One. 2016 Aug 23;11(8):e0161112. doi: 10.1371/journal.pone.0161112. eCollection 2016.
6
Semi-supervised clustering methods.
Wiley Interdiscip Rev Comput Stat. 2013;5(5):349-361. doi: 10.1002/wics.1270.
7
Comparative analysis of missing value imputation methods to improve clustering and interpretation of microarray experiments.
BMC Genomics. 2010 Jan 7;11:15. doi: 10.1186/1471-2164-11-15.
8
FunSimMat update: new features for exploring functional similarity.
Nucleic Acids Res. 2010 Jan;38(Database issue):D244-8. doi: 10.1093/nar/gkp979. Epub 2009 Nov 18.
9
caBIG VISDA: modeling, visualization, and discovery for cluster analysis of genomic data.
BMC Bioinformatics. 2008 Sep 18;9:383. doi: 10.1186/1471-2105-9-383.
10
FunSimMat: a comprehensive functional similarity database.
Nucleic Acids Res. 2008 Jan;36(Database issue):D434-9. doi: 10.1093/nar/gkm806. Epub 2007 Oct 11.