DST-FIST Bioinformatics Lab, Department of Computer Science and Engineering, International Institute of Information Technology (IIIT), Bhubaneswar, India.
IET Syst Biol. 2020 Dec;14(6):323-333. doi: 10.1049/iet-syb.2020.0024.
Computational analysis of microarray data is crucial for understanding gene behaviour and deriving meaningful results. Clustering and biclustering of gene expression microarray data in the unsupervised domain are extremely important, as their outcomes directly influence many aspects of healthcare research. However, these approaches fail when time is added as a third dimension to the microarray datasets. Such three-dimensional datasets can be analysed using triclustering, which discovers sets of genes that exhibit similar behaviour under a subset of conditions at specific time points. A novel triclustering algorithm (TriRNSC) is proposed in this manuscript to discover meaningful triclusters in gene expression profiles. TriRNSC is based on restricted neighbourhood search clustering (RNSC), a popular graph-based clustering approach, and considers the genes, the experimental conditions and the time points simultaneously. The performance of the proposed algorithm is evaluated in terms of tricluster volume and other performance measures. Gene Ontology and KEGG pathway analyses are used to validate the TriRNSC results biologically. The efficiency of TriRNSC indicates its capability and reliability and also demonstrates its usability over other state-of-the-art schemes. The proposed framework introduces the application of the RNSC algorithm to the triclustering of gene expression profiles.
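To make the data model concrete, the following is a minimal illustrative sketch, not the TriRNSC algorithm itself. It assumes the expression data are stored as a genes × conditions × time points array, and that a tricluster is a triple of index subsets whose volume is the product of the three subset sizes, which is the usual definition in the triclustering literature. The abstract does not define volume, and the names `Tricluster` and `extract` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Tricluster:
    """Hypothetical container: index subsets along each of the three axes."""
    genes: Sequence[int]
    conditions: Sequence[int]
    times: Sequence[int]

    def volume(self) -> int:
        # Assumed definition: |genes| * |conditions| * |time points|.
        return len(self.genes) * len(self.conditions) * len(self.times)


def extract(data: List[List[List[float]]], tc: Tricluster) -> List[List[List[float]]]:
    """Return the sub-cube of `data` selected by the tricluster's indices."""
    return [
        [[data[g][c][t] for t in tc.times] for c in tc.conditions]
        for g in tc.genes
    ]


# Toy dataset (made up for illustration): 4 genes x 3 conditions x 2 time points.
data = [
    [[float(g + c + t) for t in range(2)] for c in range(3)]
    for g in range(4)
]

tc = Tricluster(genes=[0, 2], conditions=[0, 1], times=[1])
sub = extract(data, tc)
print(tc.volume())
print(sub)
```

A triclustering method such as the one proposed would search for index subsets whose extracted sub-cube is coherent; the sketch only shows how a candidate tricluster is represented and how its volume would be counted under the stated assumption.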