Online-adjusted evolutionary biclustering algorithm to identify significant modules in gene expression data.

Authors

Galindo-Hernández Raúl, Rodríguez-Vázquez Katya, Galán-Vásquez Edgardo, Hernández Castellanos Carlos Ignacio

Affiliation

Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Circuito Escolar, Ciudad Universitaria, 04510 Mexico City, México.

Publication

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae681.

Abstract

Analyzing gene expression data helps identify significant biological relationships among genes. With a growing number of open biological datasets available, reliable and innovative methods are needed to analyze biological data in depth and to ensure that decisions rest on accurate information. Evolutionary algorithms have been successful in the analysis of biological datasets, but there is still room for improvement. In this work, we propose the Online-Adjusted EVOlutionary Biclustering algorithm (OAEVOB), a novel evolutionary biclustering algorithm that efficiently handles vast gene expression data. OAEVOB incorporates an online-adjustment feature that efficiently identifies significant groups by updating the mutation probability and crossover parameters. We use Pearson correlation, distance correlation, biweight midcorrelation, and mutual information to assess the similarity of genes in the biclusters. Algorithms in the specialized literature do not address generalization to diverse gene expression sources. Therefore, to evaluate OAEVOB's performance, we analyzed six gene expression datasets obtained from diverse sequencing data sources: deoxyribonucleic acid (DNA) microarray, ribonucleic acid (RNA) sequencing, and single-cell RNA sequencing. OAEVOB identified significant, broad gene expression biclusters with correlations greater than 0.5 across all similarity measurements employed. Additionally, functional enrichment analysis shows that the biclusters exhibit biological functions, suggesting that OAEVOB effectively identifies biclusters with specific cancer- and tissue-related genes in the analyzed datasets. We compared OAEVOB's performance with state-of-the-art methods; OAEVOB outperformed them, showing robustness to noise, overlapping, sequencing data sources, and gene coverage.
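
To make the similarity criterion in the abstract concrete, the following is a minimal sketch of how one of the four listed measures, Pearson correlation, could be used to score a bicluster against the 0.5 threshold. It is not OAEVOB's implementation. The function names (`pearson`, `mean_abs_correlation`), the use of the mean absolute pairwise correlation as the score, the handling of constant profiles, and the toy expression values are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant profile counts as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def mean_abs_correlation(matrix, genes, samples):
    """Average |r| over all gene pairs, restricted to the bicluster's samples.

    Assumption: the absolute value is used so that strongly anti-correlated
    genes also count as similar; the paper may aggregate differently.
    """
    profiles = [[matrix[g][s] for s in samples] for g in genes]
    pairs = list(combinations(profiles, 2))
    if not pairs:
        return 0.0
    return sum(abs(pearson(p, q)) for p, q in pairs) / len(pairs)


# Toy expression matrix: rows are genes, columns are samples (made-up values).
expression = [
    [1.0, 2.0, 3.0, 4.0, 0.3],
    [2.0, 4.1, 6.0, 8.2, 9.0],
    [4.0, 3.0, 2.0, 1.0, 5.0],
    [0.5, 0.1, 0.9, 0.2, 0.7],
]
bicluster_genes = [0, 1, 2]       # hypothetical gene (row) subset
bicluster_samples = [0, 1, 2, 3]  # hypothetical sample (column) subset

score = mean_abs_correlation(expression, bicluster_genes, bicluster_samples)
passes_threshold = score > 0.5  # 0.5 is the correlation level cited in the abstract
```

A bicluster is a subset of genes together with a subset of samples, so the score is computed only on the selected columns. The other measures named in the abstract (distance correlation, biweight midcorrelation, mutual information) could be substituted for `pearson` in the same pairwise loop.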

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5c2c/11695933/573138ceb693/bbae681f1.jpg
