
A multi-stage approach to clustering and imputation of gene expression profiles.

Authors

Wong Dorothy S V, Wong Frederick K, Wood Graham R

Affiliation

Department of Statistics, Macquarie University, NSW 2109, Australia.

Publication

Bioinformatics. 2007 Apr 15;23(8):998-1005. doi: 10.1093/bioinformatics/btm053. Epub 2007 Feb 18.

DOI:10.1093/bioinformatics/btm053
PMID:17308340
Abstract

MOTIVATION

Microarray experiments have revolutionized the study of gene expression with their ability to generate large amounts of data. This article describes an alternative to existing approaches to clustering of gene expression profiles; the key idea is to cluster in stages using a hierarchy of distance measures. This method is motivated by the way in which the human mind sorts and so groups many items. The distance measures arise from the orthogonal breakup of Euclidean distance, giving us a set of independent measures of different attributes of the gene expression profile. Interpretation of these distances is closely related to the statistical design of the microarray experiment. This clustering method not only accommodates missing data but also leads to an associated imputation method.
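As an illustration of what an orthogonal breakup of Euclidean distance can look like, the sketch below splits the squared distance between two profiles into a "level" part (difference in mean expression) and a "shape" part (difference after centering). This two-way split is an assumption chosen for simplicity; the abstract does not specify the authors' decomposition, which it ties to the statistical design of the experiment. The function name `split_distance` and the sample profiles are hypothetical.

```python
import numpy as np

def split_distance(x, y):
    """Split squared Euclidean distance into two orthogonal parts.

    level: squared length of the projection of (x - y) onto the
           constant vector, i.e. n * (mean(x) - mean(y))**2
    shape: squared length of the remainder, i.e. the distance
           between the mean-centered profiles
    """
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    level = d.size * d.mean() ** 2
    shape = float(np.sum((d - d.mean()) ** 2))
    return float(level), shape

# Two hypothetical profiles with the same shape but different levels.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 4.0, 5.0, 6.0]
level, shape = split_distance(x, y)
total = float(np.sum((np.asarray(x) - np.asarray(y)) ** 2))
print(level, shape, total)
```

Because the two parts are orthogonal, they always sum to the full squared distance, so a staged procedure could in principle group profiles on one component first and refine on the next.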

RESULTS

The performance of the clustering and imputation methods was tested on a simulated dataset, a yeast cell cycle dataset and a central nervous system development dataset. Based on the Rand and adjusted Rand indices, the clustering method is more consistent with the biological classification of the data than commonly used clustering methods. The imputation method, at varying levels of missingness, outperforms most imputation methods, based on root mean squared error (RMSE).
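The three evaluation measures named above have standard definitions, sketched here in plain Python. This is not the authors' R code; the function names and the toy labels are assumptions for illustration. The Rand index is the fraction of item pairs on which two partitions agree, the adjusted Rand index corrects that count for chance agreement, and RMSE compares imputed values with the values that were held out.

```python
from collections import Counter
from itertools import combinations
from math import comb, sqrt

def rand_index(a, b):
    """Fraction of item pairs treated the same way by both partitions.

    Assumes at least two items.
    """
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def adjusted_rand_index(a, b):
    """Rand index corrected for chance, via the contingency table."""
    n = len(a)
    sum_cells = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(a).values())
    sum_cols = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case: both partitions trivial
        return 1.0
    return (sum_cells - expected) / (max_index - expected)

def rmse(true_values, imputed_values):
    """Root mean squared error between held-out and imputed values."""
    squared = [(t - i) ** 2 for t, i in zip(true_values, imputed_values)]
    return sqrt(sum(squared) / len(squared))

# Hypothetical labels: a biological classification and a clustering result.
biological = [0, 0, 1, 1, 2, 2]
clustered = [1, 1, 0, 0, 2, 2]
print(rand_index(biological, clustered))
print(adjusted_rand_index(biological, clustered))
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Both indices depend only on which items share a cluster, not on the label values, so a relabeled but otherwise identical partition scores as a perfect match.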

AVAILABILITY

Code in R is available on request from the authors.

Similar articles

1
A multi-stage approach to clustering and imputation of gene expression profiles.
Bioinformatics. 2007 Apr 15;23(8):998-1005. doi: 10.1093/bioinformatics/btm053. Epub 2007 Feb 18.
2
Towards clustering of incomplete microarray data without the use of imputation.
Bioinformatics. 2007 Jan 1;23(1):107-13. doi: 10.1093/bioinformatics/btl555. Epub 2006 Oct 31.
3
DNA microarray data imputation and significance analysis of differential expression.
Bioinformatics. 2005 Nov 15;21(22):4155-61. doi: 10.1093/bioinformatics/bti638. Epub 2005 Aug 23.
4
Detecting clusters of different geometrical shapes in microarray gene expression data.
Bioinformatics. 2005 May 1;21(9):1927-34. doi: 10.1093/bioinformatics/bti251. Epub 2005 Jan 12.
5
Graph-based consensus clustering for class discovery from gene expression data.
Bioinformatics. 2007 Nov 1;23(21):2888-96. doi: 10.1093/bioinformatics/btm463. Epub 2007 Sep 14.
6
Clustering of change patterns using Fourier coefficients.
Bioinformatics. 2008 Jan 15;24(2):184-91. doi: 10.1093/bioinformatics/btm568. Epub 2007 Nov 19.
7
A mixture model with random-effects components for clustering correlated gene-expression profiles.
Bioinformatics. 2006 Jul 15;22(14):1745-52. doi: 10.1093/bioinformatics/btl165. Epub 2006 May 3.
8
Weighted rank aggregation of cluster validation measures: a Monte Carlo cross-entropy approach.
Bioinformatics. 2007 Jul 1;23(13):1607-15. doi: 10.1093/bioinformatics/btm158. Epub 2007 May 5.
9
Classification based upon gene expression data: bias and precision of error rates.
Bioinformatics. 2007 Jun 1;23(11):1363-70. doi: 10.1093/bioinformatics/btm117. Epub 2007 Mar 28.
10
How does gene expression clustering work?
Nat Biotechnol. 2005 Dec;23(12):1499-501. doi: 10.1038/nbt1205-1499.

Cited by

1
Recent Advancements in Subcellular Proteomics: Growing Impact of Organellar Protein Niches on the Understanding of Cell Biology.
J Proteome Res. 2024 Aug 2;23(8):2700-2722. doi: 10.1021/acs.jproteome.3c00839. Epub 2024 Mar 7.
2
Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects.
BMC Bioinformatics. 2012 Nov 14;13:300. doi: 10.1186/1471-2105-13-300.
3
Incorporating Nonlinear Relationships in Microarray Missing Value Imputation.
IEEE/ACM Trans Comput Biol Bioinform. 2011 May-Jun;8(3):723-31. doi: 10.1109/TCBB.2010.73.
4
Comparative analysis of missing value imputation methods to improve clustering and interpretation of microarray experiments.
BMC Genomics. 2010 Jan 7;11:15. doi: 10.1186/1471-2164-11-15.
5
How to improve postgenomic knowledge discovery using imputation.
EURASIP J Bioinform Syst Biol. 2009;2009(1):717136. doi: 10.1155/2009/717136. Epub 2009 Feb 8.