
Clustering of gene expression data based on shape similarity.

Authors

Hestilow Travis J, Huang Yufei

Affiliation

Department of Electrical and Computer Engineering, The University of Texas at San Antonio, San Antonio, TX 78249, USA.

Publication information

EURASIP J Bioinform Syst Biol. 2009;2009(1):195712. doi: 10.1155/2009/195712. Epub 2009 Apr 23.

DOI:10.1155/2009/195712
PMID:19404484
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3171421/
Abstract

A method for gene clustering from expression profiles using shape information is presented. Conventional clustering approaches such as K-means assume that genes with similar functions have similar expression levels and hence allocate genes with similar expression levels to the same cluster. However, genes with similar function often exhibit similarity in signal shape even though their expression magnitudes can be far apart. Therefore, this investigation studies clustering according to signal shape similarity. The shape information is captured in the form of normalized and time-scaled forward first differences, which are then subjected to a variational Bayes clustering plus a non-Bayesian (Silhouette) cluster statistic. The statistic shows an improved ability to identify the correct number of clusters and to assign the components of each cluster. Based on initial results for both generated test data and Escherichia coli microarray expression data, and on initial validation of the Escherichia coli results, it is shown that the method has promise in being able to better cluster time-series microarray data according to shape similarity.
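
The shape representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact normalization, so unit Euclidean length is assumed here, and the variational Bayes clustering step is not reproduced (cluster labels are supplied by hand). The names `shape_features` and `mean_silhouette` and the toy profiles are hypothetical, chosen only for illustration.

```python
import math

def shape_features(values, times):
    """Forward first differences divided by the time step, scaled to unit length.
    Unit-length scaling is an assumption; the abstract only says 'normalized'."""
    diffs = [(values[i + 1] - values[i]) / (times[i + 1] - times[i])
             for i in range(len(values) - 1)]
    norm = math.sqrt(sum(d * d for d in diffs))
    # A flat profile carries no shape information; avoid dividing by zero.
    return [d / norm for d in diffs] if norm > 0 else diffs

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_silhouette(points, labels):
    """Mean silhouette width; expects a list of labels with at least two clusters."""
    scores = []
    for i, p in enumerate(points):
        same = [euclidean(p, q) for j, q in enumerate(points)
                if j != i and labels[j] == labels[i]]
        if not same:  # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        a = sum(same) / len(same)  # mean distance within the point's own cluster
        b = min(                   # mean distance to the nearest other cluster
            sum(euclidean(p, q) for j, q in enumerate(points) if labels[j] == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

times = [0, 1, 2, 4]               # unevenly spaced sampling times (toy values)
profiles = [
    [1.0, 2.0, 3.0, 2.0],          # rise then fall, low magnitude
    [10.0, 20.0, 30.0, 20.0],      # same shape, ten times the magnitude
    [5.0, 4.0, 3.0, 6.0],          # fall then rise
    [50.0, 40.0, 30.0, 60.0],      # same shape as the previous profile, larger magnitude
]
features = [shape_features(p, times) for p in profiles]
by_shape = mean_silhouette(features, [0, 0, 1, 1])  # grouping by shape
by_level = mean_silhouette(features, [0, 1, 0, 1])  # grouping by magnitude
```

In this toy example, profiles that differ only in magnitude map to the same feature vector, so the grouping by shape gets a silhouette close to 1 while the grouping by magnitude gets a negative one. Comparing the silhouette across candidate clusterings is one way such a statistic can help choose the number of clusters.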


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c0b/3171421/134bf3682bf2/1687-4153-2009-195712-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c0b/3171421/93e6a7f479bf/1687-4153-2009-195712-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c0b/3171421/5c6d0a278851/1687-4153-2009-195712-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c0b/3171421/30c998b0f4b2/1687-4153-2009-195712-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c0b/3171421/6e07ce8d2108/1687-4153-2009-195712-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c0b/3171421/5c24ce92aebd/1687-4153-2009-195712-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c0b/3171421/6461d90659c5/1687-4153-2009-195712-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c0b/3171421/d6653233a582/1687-4153-2009-195712-8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c0b/3171421/510e7d13465f/1687-4153-2009-195712-9.jpg
