Efficient Mining of Discriminative Co-clusters from Gene Expression Data.

Authors

Odibat Omar, Reddy Chandan K

Affiliation

Department of Computer Science, Wayne State University, Detroit, MI, 48202.

Publication

Knowl Inf Syst. 2014 Dec;41(3):667-696. doi: 10.1007/s10115-013-0684-0.

Abstract

Discriminative models are used to analyze the differences between two classes and to identify class-specific patterns. Most of the existing discriminative models depend on using the entire feature space to compute the discriminative patterns for each class. Co-clustering has been proposed to capture the patterns that are correlated in a subset of features, but it cannot handle discriminative patterns in labeled datasets. In certain biological applications such as gene expression analysis, it is critical to consider the discriminative patterns that are correlated only in a subset of the feature space. The objective of this paper is two-fold: first, it presents an algorithm to efficiently find arbitrarily positioned co-clusters from complex data. Second, it extends this co-clustering algorithm to discover discriminative co-clusters by incorporating the class information into the co-cluster search process. In addition, we also characterize the discriminative co-clusters and propose three novel measures that can be used to evaluate the performance of any discriminative subspace pattern mining algorithm. We evaluated the proposed algorithms on several synthetic and real gene expression datasets, and our experimental results showed that the proposed algorithms outperformed several existing algorithms available in the literature.
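
To make the idea of a pattern that is "correlated only in a subset of the feature space" concrete, the sketch below scores a candidate gene subset by how much more coherent it is in one class than in the other. It is an illustration only, not the algorithm or any of the three measures proposed in the paper: the function names (`pearson`, `mean_abs_correlation`, `coherence_gap`), the scoring rule, and the toy matrix are all assumptions made for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_abs_correlation(expr, genes, samples):
    """Average |correlation| over all pairs of the chosen genes,
    computed on the chosen samples only (hypothetical coherence score)."""
    rows = [[expr[g][s] for s in samples] for g in genes]
    pairs = list(combinations(rows, 2))
    return sum(abs(pearson(x, y)) for x, y in pairs) / len(pairs)


def coherence_gap(expr, genes, samples_a, samples_b):
    """Coherence in class A minus coherence in class B.
    A large positive value suggests the gene subset is correlated
    in class A but not in class B (assumed scoring rule)."""
    return (mean_abs_correlation(expr, genes, samples_a)
            - mean_abs_correlation(expr, genes, samples_b))


# Toy expression matrix: 3 genes x 8 samples (made-up values).
# Samples 0-3 belong to class A, samples 4-7 to class B.
expr = [
    [1, 2, 3, 4, 2, 5, 1, 4],  # gene 0
    [3, 5, 7, 9, 4, 1, 2, 5],  # gene 1: linear in gene 0 within class A
    [8, 6, 4, 2, 3, 3, 1, 5],  # gene 2: negatively linear in gene 0 within class A
]
class_a = [0, 1, 2, 3]
class_b = [4, 5, 6, 7]
genes = [0, 1, 2]

score_a = mean_abs_correlation(expr, genes, class_a)
score_b = mean_abs_correlation(expr, genes, class_b)
gap = coherence_gap(expr, genes, class_a, class_b)
```

In this toy matrix the three genes are perfectly linearly related across the class A samples and only weakly related across the class B samples, so `gap` is clearly positive. Taking the absolute value of the correlation lets the score treat negatively correlated genes as part of the same pattern, which is a design choice of this sketch rather than something stated in the abstract.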
