Deep learning-based approaches for multi-omics data integration and analysis.

Authors

Ballard Jenna L, Wang Zexuan, Li Wenrui, Shen Li, Long Qi

Affiliations

Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, 3700 Hamilton Walk, Philadelphia, PA, 19104, USA.

Graduate Group in Applied Mathematics and Computational Science, University of Pennsylvania, 209 S. 33rd Street, Philadelphia, PA, 19104, USA.

Publication information

BioData Min. 2024 Oct 2;17(1):38. doi: 10.1186/s13040-024-00391-z.

Abstract

BACKGROUND

The rapid growth of deep learning, together with the vast and ever-growing amount of available data, has provided ample opportunity for advances in the fusion and analysis of complex and heterogeneous data types. Different data modalities provide complementary information that can be leveraged to gain a more complete understanding of each subject. In the biomedical domain, multi-omics data includes molecular (genomics, transcriptomics, proteomics, epigenomics, metabolomics, etc.) and imaging (radiomics, pathomics) modalities which, when combined, have the potential to improve performance on prediction, classification, clustering and other tasks. Deep learning encompasses a wide variety of methods, each of which has certain strengths and weaknesses for multi-omics integration.

METHOD

In this review, we categorize recent deep learning-based approaches by their basic architectures and discuss their unique capabilities in relation to one another. We also discuss some emerging themes advancing the field of multi-omics integration.

RESULTS

Deep learning-based multi-omics integration methods were categorized broadly into non-generative (feedforward neural networks, graph convolutional neural networks, and autoencoders) and generative (variational methods, generative adversarial models, and a generative pretrained model). Generative methods have the advantage of being able to impose constraints on the shared representations to enforce certain properties or incorporate prior knowledge. They can also be used to generate or impute missing modalities. Recent advances achieved by these methods include the ability to handle incomplete data as well as going beyond the traditional molecular omics data types to integrate other modalities such as imaging data.

CONCLUSION

We expect to see further growth in methods that can handle missingness, as this is a common challenge in working with complex and heterogeneous data. Additionally, methods that integrate more data types are expected to improve performance on downstream tasks by capturing a comprehensive view of each sample.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f76f/11446004/544d2fa4fcb9/13040_2024_391_Fig1_HTML.jpg
