Paradigm of tunable clustering using Binarization of Consensus Partition Matrices (Bi-CoPaM) for gene discovery.

Affiliation

Department of Electrical Engineering and Electronics, The University of Liverpool, Brownlow Hill, Liverpool, United Kingdom.

Publication information

PLoS One. 2013;8(2):e56432. doi: 10.1371/journal.pone.0056432. Epub 2013 Feb 11.

Abstract

Clustering analysis has a growing role in the study of co-expressed genes for gene discovery. Conventional binary and fuzzy clustering do not embrace the biological reality that some genes may be irrelevant for a problem and not be assigned to a cluster, while other genes may participate in several biological functions and should simultaneously belong to multiple clusters. Also, these algorithms cannot generate tight clusters that focus on their cores or wide clusters that overlap and contain all possibly relevant genes. In this paper, a new clustering paradigm is proposed. In this paradigm, all three eventualities of a gene being exclusively assigned to a single cluster, being assigned to multiple clusters, and being not assigned to any cluster are possible. These possibilities are realised through the primary novelty of the introduction of tunable binarization techniques. Results from multiple clustering experiments are aggregated to generate one fuzzy consensus partition matrix (CoPaM), which is then binarized to obtain the final binary partitions. This is referred to as Binarization of Consensus Partition Matrices (Bi-CoPaM). The method has been tested with a set of synthetic datasets and a set of five real yeast cell-cycle datasets. The results demonstrate its validity in generating relevant tight, wide, and complementary clusters that can meet requirements of different gene discovery studies.
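The aggregate-then-binarize idea can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration: the function names `consensus_partition_matrix` and `binarize`, the plain averaging used for aggregation, the tuning parameter `delta`, and the two binarization rules labelled "tight" and "wide". The sketch also assumes the input partitions are already aligned, meaning that row k refers to the same cluster in every run. It is not the authors' implementation.

```python
import numpy as np


def consensus_partition_matrix(partitions):
    """Average aligned K x N partition matrices into one fuzzy matrix.

    Assumption: row k means the same cluster in every partition, so a
    plain mean gives the fraction of runs that put a gene in cluster k.
    """
    stacked = np.stack([np.asarray(p, dtype=float) for p in partitions])
    return stacked.mean(axis=0)


def binarize(copam, delta=0.0, mode="tight"):
    """Turn a fuzzy K x N matrix into a boolean K x N assignment.

    Both rules are illustrative assumptions, tuned by `delta`:
      tight: keep a gene's best cluster only if it beats the runner-up
             by at least delta; otherwise the gene stays unassigned.
      wide:  keep every cluster whose membership is within delta of the
             gene's best one, so a gene may land in several clusters.
    """
    copam = np.asarray(copam, dtype=float)
    if copam.shape[0] < 2:
        raise ValueError("need at least two clusters")
    top = copam.max(axis=0)
    nonzero = copam > 0  # never assign a gene to a cluster no run voted for
    if mode == "wide":
        return (copam >= top - delta) & nonzero
    if mode == "tight":
        runner_up = np.sort(copam, axis=0)[-2]
        return (copam == top) & (top - runner_up >= delta) & nonzero
    raise ValueError(f"unknown mode: {mode!r}")


# Three hypothetical clustering runs over 4 genes and 2 clusters.
runs = [
    [[1, 1, 0, 0], [0, 0, 1, 1]],
    [[1, 0, 0, 0], [0, 1, 1, 1]],
    [[1, 1, 0, 1], [0, 0, 1, 0]],
]
copam = consensus_partition_matrix(runs)
tight = binarize(copam, delta=0.5, mode="tight")
wide = binarize(copam, delta=0.5, mode="wide")
```

In this toy input, the two genes on which the runs disagree are left out of every tight cluster and placed in both wide clusters, while the two unanimous genes are assigned exclusively in either mode. This mirrors the three outcomes the abstract describes: exclusive assignment, multiple assignment, and no assignment.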

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a46/3569426/9b14d8317265/pone.0056432.g001.jpg
