Pontes Beatriz, Giráldez Raúl, Aguilar-Ruiz Jesús S
Department of Computer Languages, University of Seville, Seville, Spain.
Algorithms Mol Biol. 2013 Feb 23;8(1):4. doi: 10.1186/1748-7188-8-4.
Biclustering algorithms for microarray data aim to discover functionally related gene sets under different subsets of experimental conditions. Because of the complexity of the problem and the characteristics of microarray datasets, heuristic searches are usually used instead of exhaustive algorithms. Comparing different techniques also remains a challenge: the results they produce vary in relevant features such as the number of genes or conditions, which makes a fair comparison difficult. Moreover, existing approaches do not allow the user to specify any preferences on these properties.
Here, we present the first biclustering algorithm in which several bicluster features can be customized in terms of different objectives. This can be done by tuning the specified features in the algorithm or by incorporating new objectives into the search. Furthermore, our approach bases bicluster evaluation on expression patterns and can recognize shifting and scaling patterns, whether they occur separately or simultaneously. Evolutionary computation has been chosen as the search strategy, hence the name of our proposal, Evo-Bexpa (Evolutionary Biclustering based in Expression Patterns).
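As an illustration of what a combined shifting and scaling pattern looks like, the following minimal Python sketch builds a small bicluster in which every gene (row) is a scaled and shifted copy of one base profile, then checks whether the rows agree once each is standardized. It assumes the commonly used formulation in which row i equals scale_i * base_j + shift_i, with positive scales and non-constant rows. The function names and the check itself are hypothetical and chosen for illustration only; this is not the evaluation measure used by Evo-Bexpa.

```python
import math

def make_pattern_bicluster(base, scales, shifts):
    """Row i is scales[i] * base[j] + shifts[i] (assumed shifting-and-scaling model)."""
    return [[a * p + b for p in base] for a, b in zip(scales, shifts)]

def standardize(row):
    """Center a row on its mean and divide by its (population) standard deviation.

    Assumes the row is not constant, otherwise the deviation is zero.
    """
    mean = sum(row) / len(row)
    std = math.sqrt(sum((x - mean) ** 2 for x in row) / len(row))
    return [(x - mean) / std for x in row]

def is_perfect_pattern(bicluster, tol=1e-9):
    """True if all standardized rows coincide within tol.

    With positive scales, standardization cancels both the scale and the
    shift, so rows of a perfect pattern become identical.
    """
    ref = standardize(bicluster[0])
    return all(
        all(abs(x - r) <= tol for x, r in zip(standardize(row), ref))
        for row in bicluster[1:]
    )

# Hypothetical toy data: three genes measured under four conditions.
base = [1.0, 4.0, 2.0, 5.0]
bicluster = make_pattern_bicluster(base, scales=[1.0, 2.0, 0.5], shifts=[0.0, 3.0, -1.0])

# Perturb one value to break the pattern.
noisy = [row[:] for row in bicluster]
noisy[1][2] += 2.0

print(is_perfect_pattern(bicluster))
print(is_perfect_pattern(noisy))
```

Setting all scales to 1 gives a pure shifting pattern, and setting all shifts to 0 gives a pure scaling pattern, so the same toy check covers both cases under the stated assumptions.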
We have conducted experiments on both synthetic and real datasets that demonstrate Evo-Bexpa's ability to obtain meaningful biclusters. The synthetic experiments were designed to compare the performance of Evo-Bexpa with that of other approaches when searching for perfect patterns. Experiments with four different real datasets also confirm that the algorithm performs well; its results have been biologically validated through Gene Ontology.