Strategic Multi-Omics Data Integration via Multi-Level Feature Contrasting and Matching.

Publication information

IEEE Trans Nanobioscience. 2024 Oct;23(4):579-590. doi: 10.1109/TNB.2024.3456797. Epub 2024 Oct 15.

Abstract

The analysis and comprehension of multi-omics data have emerged as a prominent topic in bioinformatics and data science. However, the sparsity and high dimensionality of omics data make it difficult to extract meaningful information. Moreover, the heterogeneity inherent in multiple omics sources makes effective integration of multi-omics data challenging. To tackle these challenges, we propose MFCC-SAtt, a multi-level feature contrast clustering model based on self-attention that extracts informative features from multi-omics data. MFCC-SAtt treats each omics type as a distinct modality and employs an autoencoder with self-attention for each modality to integrate and compress its features into a shared feature space. By using a multi-level feature extraction framework together with a semantic information extractor, we mitigate optimization conflicts arising from different learning objectives. Additionally, MFCC-SAtt guides deep clustering with multi-level features, which further enhances the quality of the output labels. Extensive experiments on multi-omics data validate the performance of MFCC-SAtt. For instance, in a pan-cancer clustering task, MFCC-SAtt achieved an accuracy of over 80.38%.
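The pipeline described in the abstract (per-modality encoders, self-attention fusion into a shared space, and a contrastive objective between modalities) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the modality names (`mRNA`, `methylation`), the dimensions, the single-layer `encode` function standing in for an autoencoder, and the InfoNCE-style loss are all assumptions chosen for illustration, and nothing here is trained.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def encode(x, w):
    # Hypothetical one-layer encoder; stands in for a per-modality autoencoder.
    return np.tanh(x @ w)

def self_attention_fuse(z):
    # z has shape (n_samples, n_modalities, d).
    # Scaled dot-product self-attention across modalities, then mean pooling.
    d = z.shape[-1]
    scores = z @ z.transpose(0, 2, 1) / np.sqrt(d)   # (n, m, m)
    attended = softmax(scores, axis=-1) @ z          # (n, m, d)
    return attended.mean(axis=1)                     # (n, d)

def info_nce(a, b, temperature=0.5):
    # Assumed contrastive loss: row i of a and row i of b form a positive pair,
    # all other rows of b act as negatives.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
n, d = 8, 4  # illustrative sample count and shared latent size

# Synthetic stand-ins for two omics modalities (not real data).
omics = {
    "mRNA": rng.normal(size=(n, 20)),
    "methylation": rng.normal(size=(n, 30)),
}
weights = {k: rng.normal(scale=0.1, size=(v.shape[1], d)) for k, v in omics.items()}

latents = {k: encode(v, weights[k]) for k, v in omics.items()}
fused = self_attention_fuse(np.stack(list(latents.values()), axis=1))
loss = info_nce(latents["mRNA"], latents["methylation"])
```

In a trained model the fused representation would feed a clustering head, and the contrastive term would be optimized jointly with reconstruction; the sketch only shows how the shapes and the loss fit together.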

