A multi-objective gene clustering algorithm guided by apriori biological knowledge with intensification and diversification strategies.

Authors

Parraga-Alava Jorge, Dorn Marcio, Inostroza-Ponta Mario

Affiliations

1. Centre for Biotechnology and Bioengineering (CeBiB), Departamento de Ingeniería Informática, Universidad de Santiago de Chile, Av. Ecuador 3659, Santiago, Chile.

2. Carrera de Computación, Escuela Superior Politécnica Agropecuaria de Manabí Manuel Félix López, Campus Politécnico Sitio El Limón, Calceta, Ecuador.

Publication information

BioData Min. 2018 Aug 7;11:16. doi: 10.1186/s13040-018-0178-4. eCollection 2018.

Abstract

BACKGROUND

Biologists aim to understand the genetic background of diseases, metabolic disorders or any other genetic condition. Microarrays are one of the main high-throughput technologies for collecting information about the behaviour of genetic information under different conditions. Clustering is one of the main techniques used to analyse these data; it aims to find groups of genes that share some criterion, such as a similar expression profile. However, the problem of finding groups is normally multidimensional, making it necessary to approach clustering as a multi-objective problem in which several cluster validity indexes are optimised simultaneously. These indexes are usually based on criteria such as compactness and separation, which may not be sufficient because they cannot guarantee clusters that have both similar expression patterns and biological coherence.
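
To make the two criteria named above concrete, here is a minimal Python sketch of one common way to measure them. The definitions used (mean distance to the cluster centroid for compactness, smallest distance between centroids for separation) are assumptions chosen for illustration; the abstract does not state which validity indexes the authors use, and the toy profiles are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Component-wise mean of a list of equal-length profiles."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def compactness(clusters):
    """Mean distance from each profile to its own cluster centroid (lower is better)."""
    total, count = 0.0, 0
    for members in clusters:
        c = centroid(members)
        total += sum(dist(p, c) for p in members)
        count += len(members)
    return total / count


def separation(clusters):
    """Smallest distance between any two cluster centroids (higher is better)."""
    cents = [centroid(m) for m in clusters]
    return min(dist(a, b) for i, a in enumerate(cents) for b in cents[i + 1:])


# Hypothetical toy expression profiles: two genes per cluster, two conditions per gene.
clusters = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(4.0, 0.0), (4.0, 2.0)],
]
print(compactness(clusters))  # 1.0
print(separation(clusters))   # 4.0
```

Both numbers depend only on the expression values, which illustrates the limitation raised above: nothing in either score reflects whether the genes in a cluster are biologically related.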

METHOD

We propose a Multi-Objective Clustering algorithm Guided by a-Priori Biological Knowledge (MOC-GaPBK) to find clusters of genes with high levels of co-expression and biological coherence, as well as good compactness and separation. Cluster quality indexes are used to simultaneously optimise gene relationships at the level of expression and of biological functionality. Our proposal also includes intensification and diversification strategies to improve the search process.
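
Optimising several indexes simultaneously generally means there is no single best clustering, only a set of trade-offs. The sketch below shows Pareto dominance, a standard way to compare candidates in multi-objective optimisation. It is an illustration under stated assumptions, not the MOC-GaPBK procedure itself: the abstract does not describe the selection mechanism, and the candidate names and scores are hypothetical, with both objectives assumed to be minimised.

```python
def dominates(a, b):
    """True if score vector a is no worse than b on every objective
    and strictly better on at least one (all objectives minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scored):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, s in scored.items()
        if not any(dominates(t, s) for other, t in scored.items() if other != name)
    ]


# Hypothetical candidate clusterings scored as
# (expression-based index, biology-based index), both to be minimised.
candidates = {
    "A": (0.20, 0.70),
    "B": (0.35, 0.40),
    "C": (0.40, 0.75),  # worse than A on both objectives
    "D": (0.60, 0.30),
}
print(pareto_front(candidates))  # ['A', 'B', 'D']
```

Candidate C is discarded because A beats it on both objectives, while A, B and D each represent a different balance between expression similarity and biological coherence.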

RESULTS

The effectiveness of the proposed algorithm is demonstrated on four publicly available datasets. Comparative studies of the use of different objective functions and other widely used microarray clustering techniques are reported. Statistical, visual and biological significance tests are carried out to show the superiority of the proposed algorithm.

CONCLUSIONS

Integrating a-priori biological knowledge into a multi-objective approach and using intensification and diversification strategies allow the proposed algorithm to find solutions with higher quality than other microarray clustering techniques available in the literature in terms of co-expression, biological coherence, compactness and separation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b5d/6081857/282858f98af7/13040_2018_178_Fig1_HTML.jpg
