Constrained clusters of gene expression profiles with pathological features.

Authors

Sese Jun, Kurokawa Yukinori, Monden Morito, Kato Kikuya, Morishita Shinichi

Affiliations

Undergraduate Program for Bioinformatics and Systems Biology, Graduate School of Frontier Sciences, University of Tokyo, Bunkyo, Tokyo, Japan.

Publication information

Bioinformatics. 2004 Nov 22;20(17):3137-45. doi: 10.1093/bioinformatics/bth373. Epub 2004 Jun 24.

DOI:10.1093/bioinformatics/bth373
PMID:15217814
Abstract

MOTIVATION

Gene expression profiles should be useful in distinguishing variations in disease, since they reflect accurately the status of cells. The primary clustering of gene expression reveals the genotypes that are responsible for the proximity of members within each cluster, while further clustering elucidates the pathological features of the individual members of each cluster. However, since the first clustering process and the second classification step, in which the features are associated with clusters, are performed independently, the initial set of clusters may omit genes that are associated with pathologically meaningful features. Therefore, it is important to devise a way of identifying gene expression clusters that are associated with pathological features.

RESULTS

We present the novel technique of 'itemset constrained clustering' (IC-Clustering), which computes the optimal cluster that maximizes the interclass variance of gene expression between groups, which are divided according to the restriction that only divisions that can be expressed using common features are allowed. This constraint automatically labels each cluster with a set of pathological features which characterize that cluster. When applied to liver cancer datasets, IC-Clustering revealed informative gene expression clusters, which could be annotated with various pathological features, such as 'tumor' and 'man', or 'except tumor' and 'normal liver function'. In contrast, the k-means method overlooked these clusters.
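The idea in the RESULTS paragraph can be illustrated with a small sketch. This is not the authors' implementation: it is a brute-force illustration that assumes each sample carries a set of binary pathological features, enumerates small feature itemsets, splits the samples into those that have every feature in the itemset and the rest, and keeps the split with the largest interclass (between-group) variance of expression. The function names, the `max_items` limit, and the toy data are all hypothetical, and an exhaustive search like this would not scale to real datasets.

```python
from itertools import combinations

import numpy as np


def interclass_variance(X, mask):
    """Between-group sum of squares for the split mask / not mask."""
    n_in, n_out = int(mask.sum()), int((~mask).sum())
    if n_in == 0 or n_out == 0:
        return 0.0  # a one-sided split separates nothing
    overall = X.mean(axis=0)
    d_in = X[mask].mean(axis=0) - overall
    d_out = X[~mask].mean(axis=0) - overall
    return float(n_in * (d_in @ d_in) + n_out * (d_out @ d_out))


def best_itemset_split(X, F, names, max_items=2):
    """Exhaustively search itemsets of up to max_items features.

    X: (samples, genes) expression matrix.
    F: (samples, features) binary feature matrix.
    Returns the best itemset, its score, and the sample mask it selects.
    """
    F = np.asarray(F, dtype=bool)
    best_items, best_score, best_mask = (), 0.0, None
    for k in range(1, max_items + 1):
        for idx in combinations(range(F.shape[1]), k):
            # A sample joins the cluster only if it has every item.
            mask = F[:, list(idx)].all(axis=1)
            score = interclass_variance(X, mask)
            if score > best_score:
                best_items = tuple(names[i] for i in idx)
                best_score, best_mask = score, mask
    return best_items, best_score, best_mask


# Toy data (invented): 6 samples, 2 genes, 2 binary features.
X = np.array([[5.0, 5.0], [5.0, 5.0], [0.0, 0.0],
              [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
F = np.array([[1, 1], [1, 1], [1, 0], [0, 1], [0, 0], [0, 0]])
items, score, mask = best_itemset_split(X, F, ["tumor", "male"])
print(items, round(score, 2), mask.tolist())
```

In this toy example only the samples labelled both 'tumor' and 'male' have elevated expression, so the combined itemset scores higher than either feature alone, and the selected cluster comes with its feature labels attached.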

Similar articles

1
Constrained clusters of gene expression profiles with pathological features.
Bioinformatics. 2004 Nov 22;20(17):3137-45. doi: 10.1093/bioinformatics/bth373. Epub 2004 Jun 24.
2
Simultaneous classification and feature clustering using discriminant vector quantization with applications to microarray data analysis.
Proc IEEE Comput Soc Bioinform Conf. 2002;1:246-55.
3
Simple decision rules for classifying human cancers from gene expression profiles.
Bioinformatics. 2005 Oct 15;21(20):3896-904. doi: 10.1093/bioinformatics/bti631. Epub 2005 Aug 16.
4
Simultaneous gene clustering and subset selection for sample classification via MDL.
Bioinformatics. 2003 Jun 12;19(9):1100-9. doi: 10.1093/bioinformatics/btg039.
5
Cancer molecular pattern discovery by subspace consensus kernel classification.
Comput Syst Bioinformatics Conf. 2007;6:55-65.
6
A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis.
Bioinformatics. 2005 Mar 1;21(5):631-43. doi: 10.1093/bioinformatics/bti033. Epub 2004 Sep 16.
7
A novel approach for clustering proteomics data using Bayesian fast Fourier transform.
Bioinformatics. 2005 May 15;21(10):2210-24. doi: 10.1093/bioinformatics/bti383. Epub 2005 Mar 15.
8
Knowledge-assisted recognition of cluster boundaries in gene expression data.
Artif Intell Med. 2005 Sep-Oct;35(1-2):171-83. doi: 10.1016/j.artmed.2005.02.007.
9
Clustering and re-clustering for pattern discovery in gene expression data.
J Bioinform Comput Biol. 2005 Apr;3(2):281-301. doi: 10.1142/s0219720005001053.
10
Clustering of gene expression data: performance and similarity analysis.
BMC Bioinformatics. 2006 Dec 12;7 Suppl 4(Suppl 4):S19. doi: 10.1186/1471-2105-7-S4-S19.

Cited by

1
Genotype matrix mapping: searching for quantitative trait loci interactions in genetic variation in complex traits.
DNA Res. 2007 Oct 31;14(5):217-25. doi: 10.1093/dnares/dsm020. Epub 2007 Nov 13.
2
Simultaneous clustering of gene expression data with clinical chemistry and pathological evaluations reveals phenotypic prototypes.
BMC Syst Biol. 2007 Feb 23;1:15. doi: 10.1186/1752-0509-1-15.
3
Systematic interpretation of microarray data using experiment annotations.
BMC Genomics. 2006 Dec 20;7:319. doi: 10.1186/1471-2164-7-319.
4
High-dimensional and large-scale phenotyping of yeast mutants.
Proc Natl Acad Sci U S A. 2005 Dec 27;102(52):19015-20. doi: 10.1073/pnas.0509436102. Epub 2005 Dec 19.