Pagnuco Inti A, Pastore Juan I, Abras Guillermo, Brun Marcel, Ballarin Virginia L
Digital Image Processing Lab., ICyTE, UNMdP, Argentina; Department of Mathematics, School of Engineering, UNMdP, Argentina; CONICET, Argentina.
Genomics. 2017 Oct;109(5-6):438-445. doi: 10.1016/j.ygeno.2017.06.009. Epub 2017 Jul 8.
Co-expression of genes is usually taken to suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes, based on some similarity criterion, is therefore an important task. It is usually performed by clustering algorithms, which group genes into meaningful clusters based on their expression values across a set of experiments. In this work, we propose a method to find sets of co-expressed genes that uses cluster validation indices as a measure of similarity for individual gene groups and combines variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated data and on real genomics data, measuring performance as its ability to detect co-regulated sets compared with a full search. Additionally, we analyzed the quality of the best-ranked groups using an online bioinformatics tool that provides network information for the selected genes.
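The general idea in the abstract (generate candidate groups with several hierarchical clustering variants, then score each group individually) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name the validation indices or the clustering variants, so the sketch uses single, complete, and average linkage on a correlation distance, and scores a group by its mean pairwise Pearson correlation. The function names (`candidate_groups`, `group_score`, `rank_groups`) and the toy expression profiles are hypothetical.

```python
import itertools
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mean(values):
    """Arithmetic mean of an iterable (used as the average-linkage rule)."""
    values = list(values)
    return sum(values) / len(values)


def candidate_groups(dist, linkage):
    """Agglomerative clustering; every merged cluster becomes a candidate group."""
    clusters = [frozenset([i]) for i in range(len(dist))]
    candidates = set()
    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest linkage distance.
        a, b = min(
            itertools.combinations(clusters, 2),
            key=lambda p: linkage(dist[i][j] for i in p[0] for j in p[1]),
        )
        merged = a | b
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
        candidates.add(merged)
    return candidates


def group_score(group, corr):
    """Assumed per-group index: mean pairwise correlation inside the group."""
    pairs = list(itertools.combinations(sorted(group), 2))
    return sum(corr[i][j] for i, j in pairs) / len(pairs)


def rank_groups(profiles, min_size=3, max_size=None):
    """Pool candidates from three linkage variants and rank them by score."""
    n = len(profiles)
    corr = [[pearson(profiles[i], profiles[j]) for j in range(n)] for i in range(n)]
    dist = [[1.0 - corr[i][j] for j in range(n)] for i in range(n)]
    candidates = set()
    for linkage in (min, max, mean):  # single, complete, average linkage
        candidates |= candidate_groups(dist, linkage)
    max_size = max_size or n - 1  # by default, skip the trivial all-genes group
    sized = [g for g in candidates if min_size <= len(g) <= max_size]
    return sorted(sized, key=lambda g: group_score(g, corr), reverse=True)


# Toy data (hypothetical): genes 0-2 follow one rising pattern,
# genes 3-5 follow unrelated patterns.
profiles = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [2.1, 4.0, 6.2, 7.9, 10.1, 12.0, 14.2, 15.9],
    [0.4, 1.1, 1.4, 2.1, 2.4, 3.1, 3.4, 4.1],
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
    [1, 1, 1, 1, -1, -1, -1, -1],
]
ranked = rank_groups(profiles)
print(sorted(ranked[0]))
```

Pooling candidates across linkage rules means a group only has to be found by one variant to be scored, and scoring each group on its own (rather than scoring a whole partition) lets tight groups be ranked even when the rest of the data has no clear cluster structure.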