Department of Computer Science and Engineering, School of Engineering, Tezpur University, Napaam, Sonitpur, Assam 784028, India.
Department of Computer Science, University of Colorado, Colorado Springs, USA.
Comput Biol Chem. 2018 Aug;75:154-167. doi: 10.1016/j.compbiolchem.2018.05.007. Epub 2018 May 7.
Developing a cost-effective and robust triclustering algorithm that can identify triclusters of high biological significance in the gene-sample-time (GST) domain is a challenging task. Most existing triclustering algorithms can detect shifting and scaling patterns in isolation, but they cannot handle co-occurring shifting-and-scaling patterns. This paper attempts to address this issue by introducing a robust triclustering algorithm, THD-Tricluster, that identifies triclusters over the GST domain. In addition to being validated on several benchmark datasets, THD-Tricluster was applied to HIV-1 progression data to identify disease-specific genes. It identified 38 genes most responsible for the deadly disease, including GATA3, EGR1, JUN, ELF1, AGFG1, AGFG2, CX3CR1, CXCL12, CCR5, CCR2, and many others. The results are validated using GeneCard and other established results.
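To make the distinction between shifting, scaling, and co-occurring shifting-and-scaling patterns concrete, the following minimal sketch builds four toy expression profiles and checks them with simple tests. It is an illustration only and is not the THD-Tricluster method: the profile values, the helper names (`is_pure_shift`, `is_pure_scale`, `pearson`), and the use of Pearson correlation as a pattern-agnostic similarity are all assumptions introduced here.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (hypothetical helper)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def is_pure_shift(x, y, tol=1e-9):
    """True if y = x + beta for one constant beta (additive pattern)."""
    diffs = [b - a for a, b in zip(x, y)]
    return max(diffs) - min(diffs) <= tol

def is_pure_scale(x, y, tol=1e-9):
    """True if y = alpha * x for one constant alpha (multiplicative pattern)."""
    ratios = [b / a for a, b in zip(x, y)]  # assumes no zero entries in x
    return max(ratios) - min(ratios) <= tol

# Toy expression profile of one gene across five time points (made-up values).
base = [1.0, 3.0, 2.0, 5.0, 4.0]
shifted = [v + 2.0 for v in base]                   # shifting only
scaled = [1.5 * v for v in base]                    # scaling only
shifted_and_scaled = [1.5 * v + 2.0 for v in base]  # both at once

print(is_pure_shift(base, shifted), is_pure_scale(base, scaled))
print(is_pure_shift(base, shifted_and_scaled),
      is_pure_scale(base, shifted_and_scaled))
print(round(pearson(base, shifted_and_scaled), 6))
```

In this toy setting, a detector that tests only for a constant difference or only for a constant ratio rejects the combined profile, even though it is an exact linear transform of the base profile and remains perfectly correlated with it. That is the kind of gap the abstract describes for algorithms that handle the two patterns only in isolation.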