A parameter free relative density based biclustering method for identifying non-linear feature relations.

Authors

Jain Namita, Ghosh Susmita, Ghosh Ashish

Affiliations

Department of Computer Science and Engineering, Jadavpur University, Kolkata 700032, India.

International Institute of Information Technology, Bhubaneswar 751003, India.

Publication information

Heliyon. 2024 Jul 20;10(15):e34736. doi: 10.1016/j.heliyon.2024.e34736. eCollection 2024 Aug 15.

DOI:10.1016/j.heliyon.2024.e34736

PMID:39157398

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11327522/

Abstract

Existing biclustering algorithms often depend on assumptions such as monotonicity or linearity of feature relations to find biclusters. Although a few algorithms overcome this limitation using density-based methods, they tend to miss many biclusters because they use global criteria to identify dense regions. The proposed method, PF-RelDenBi, uses local variations in the marginal and joint densities of each pair of features to find the subset of observations that forms the basis of the relation between them. It then uses a non-linear feature relation index to find the set of features connected by a common set of observations, which yields a bicluster. This approach finds biclusters based on feature relations even when the relations are non-linear or non-monotonic. Additionally, the proposed method does not require the user to provide any parameters, which allows it to be applied to datasets from different domains. To study the behaviour of PF-RelDenBi on datasets with different properties, experiments were carried out on sixteen simulated datasets, and its performance was compared with that of eleven state-of-the-art algorithms. The proposed method produced better results on most of the simulated datasets. Biclusters were also detected with PF-RelDenBi on five benchmark datasets. For the first two, the detected biclusters were used to generate additional features that improved classification performance. For the other three, PF-RelDenBi was compared with the eleven state-of-the-art methods in terms of accuracy, normalized mutual information (NMI) and adjusted Rand index (ARI), and it detected biclusters with greater accuracy. The method was also applied to a COVID-19 dataset to identify demographic features that are likely to affect the spread of COVID-19.
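The abstract reports comparisons in terms of NMI and ARI without defining them. As a rough illustration only, the sketch below computes both scores from two label assignments using the Python standard library. It is not taken from the paper: the function names are hypothetical, the NMI normalization (arithmetic mean of the two entropies) is an assumption since the abstract does not say which variant was used, and it assumes each observation has already been mapped to a single ground-truth label and a single predicted label.

```python
from collections import Counter
from math import comb, log


def contingency(labels_true, labels_pred):
    """Count observations for each (true label, predicted label) pair."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    return Counter(zip(labels_true, labels_pred))


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting agreement between two labelings, corrected for chance."""
    joint = contingency(labels_true, labels_pred)
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:  # fewer than two observations: treated as agreement
        return 1.0
    sum_joint = sum(comb(c, 2) for c in joint.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster in both
        return 1.0
    return (sum_joint - expected) / (max_index - expected)


def _entropy(counts, n):
    """Shannon entropy (natural log) of a labeling given its cluster sizes."""
    return -sum((c / n) * log(c / n) for c in counts)


def normalized_mutual_info(labels_true, labels_pred):
    """Mutual information divided by the mean of the two entropies.

    The arithmetic-mean normalization is an assumption of this sketch.
    """
    n = len(labels_true)
    joint = contingency(labels_true, labels_pred)
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    mi = sum(
        (c / n) * log(c * n / (rows[a] * cols[b]))
        for (a, b), c in joint.items()
    )
    denom = (_entropy(rows.values(), n) + _entropy(cols.values(), n)) / 2
    if denom == 0:  # both labelings consist of a single cluster
        return 1.0
    return mi / denom


if __name__ == "__main__":
    truth = [0, 0, 1, 1, 2, 2]
    found = [1, 1, 0, 0, 2, 2]  # same grouping, different label names
    print(adjusted_rand_index(truth, found))
    print(normalized_mutual_info(truth, found))
```

Both scores depend only on how observations are grouped, not on the label names, so a prediction that reproduces the true grouping under different labels still scores as a perfect match.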

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dca3/11327522/00661e9df23e/gr001.jpg

Similar articles

Differential co-expression framework to quantify goodness of biclusters and compare biclustering algorithms.

Algorithms Mol Biol. 2010 May 28;5:23. doi: 10.1186/1748-7188-5-23.

COSCEB: Comprehensive search for column-coherent evolution biclusters and its application to hub gene identification.

J Biosci. 2019 Jun;44(2).

A comparison and evaluation of five biclustering algorithms by quantifying goodness of biclusters for gene expression data.

BioData Min. 2012 Jul 23;5(1):8. doi: 10.1186/1756-0381-5-8.

Identification of bicluster regions in a binary matrix and its applications.

PLoS One. 2013 Aug 5;8(8):e71680. doi: 10.1371/journal.pone.0071680. Print 2013.

Identification of coherent patterns in gene expression data using an efficient biclustering algorithm and parallel coordinate visualization.

BMC Bioinformatics. 2008 Apr 23;9:210. doi: 10.1186/1471-2105-9-210.

Discovery of error-tolerant biclusters from noisy gene expression data.

BMC Bioinformatics. 2011 Nov 24;12 Suppl 12(Suppl 12):S1. doi: 10.1186/1471-2105-12-S12-S1.

Discovering biclusters in gene expression data based on high-dimensional linear geometries.

BMC Bioinformatics. 2008 Apr 23;9:209. doi: 10.1186/1471-2105-9-209.

Topological biclustering ARTMAP for identifying within bicluster relationships.

Neural Netw. 2023 Mar;160:34-49. doi: 10.1016/j.neunet.2022.12.010. Epub 2022 Dec 20.

BicPAM: Pattern-based biclustering for biomedical data analysis.

Algorithms Mol Biol. 2014 Dec 16;9(1):27. doi: 10.1186/s13015-014-0027-z. eCollection 2014.

Cited by

Evolutionary Mechanism Based Conserved Gene Expression Biclustering Module Analysis for Breast Cancer Genomics.

Biomedicines. 2024 Sep 12;12(9):2086. doi: 10.3390/biomedicines12092086.

References

MESBC: A novel mutually exclusive spectral biclustering method for cancer subtyping.

Comput Biol Chem. 2024 Apr;109:108009. doi: 10.1016/j.compbiolchem.2023.108009. Epub 2023 Dec 28.

Fast algorithms for singular value decomposition and the inverse of nearly low-rank matrices.

Natl Sci Rev. 2023 Mar 25;10(6):nwad083. doi: 10.1093/nsr/nwad083. eCollection 2023 Jun.

ARBic: an all-round biclustering algorithm for analyzing gene expression data.

NAR Genom Bioinform. 2023 Jan 31;5(1):lqad009. doi: 10.1093/nargab/lqad009. eCollection 2023 Mar.

Considering BCG vaccination to reduce the impact of COVID-19.

Lancet. 2020 May 16;395(10236):1545-1546. doi: 10.1016/S0140-6736(20)31025-4. Epub 2020 Apr 30.

An interactive web-based dashboard to track COVID-19 in real time.

Lancet Infect Dis. 2020 May;20(5):533-534. doi: 10.1016/S1473-3099(20)30120-1. Epub 2020 Feb 19.

Deep Subspace Clustering.

IEEE Trans Neural Netw Learn Syst. 2020 Dec;31(12):5509-5521. doi: 10.1109/TNNLS.2020.2968848. Epub 2020 Nov 30.

QUBIC2: a novel and robust biclustering algorithm for analyses and interpretation of large-scale RNA-Seq data.

Bioinformatics. 2020 Feb 15;36(4):1143-1149. doi: 10.1093/bioinformatics/btz692.

runibic: a Bioconductor package for parallel row-based biclustering of gene expression data.

Bioinformatics. 2018 Dec 15;34(24):4302-4304. doi: 10.1093/bioinformatics/bty512.

UniBic: Sequential row-based biclustering algorithm for analysis of gene expression data.

Sci Rep. 2016 Mar 22;6:23466. doi: 10.1038/srep23466.

Shifting-and-Scaling Correlation Based Biclustering Algorithm.

IEEE/ACM Trans Comput Biol Bioinform. 2014 Nov-Dec;11(6):1239-52. doi: 10.1109/TCBB.2014.2323054.
