
Unsupervised multiple kernel learning for heterogeneous data integration.

Affiliation

MIAT, Université de Toulouse, INRA, 31326 Castanet-Tolosan, France.

Publication information

Bioinformatics. 2018 Mar 15;34(6):1009-1015. doi: 10.1093/bioinformatics/btx682.

DOI:10.1093/bioinformatics/btx682
PMID:29077792
Abstract

MOTIVATION

Recent advances in high-throughput sequencing have expanded the breadth of available omics datasets, and the integrated analysis of multiple datasets obtained on the same samples has yielded important insights in a wide range of applications. However, integrating various sources of information remains a challenge for systems biology because the datasets produced are often of heterogeneous types, which calls for generic methods that take their different specificities into account.

RESULTS

We propose a multiple kernel framework that integrates multiple datasets of various types into a single exploratory analysis. Several solutions are provided to learn either a consensus meta-kernel or a meta-kernel that preserves the original topology of the datasets. We applied our framework to analyse two public multi-omics datasets. First, the multiple metagenomic datasets collected during the TARA Oceans expedition were explored to demonstrate that our method is able to retrieve previous findings in a single kernel PCA, as well as to provide a new picture of the sample structure when a larger number of datasets is included in the analysis. To perform this analysis, a generic procedure is also proposed to improve the interpretability of the kernel PCA with respect to the original data. Second, the multi-omics breast cancer datasets provided by The Cancer Genome Atlas are analysed using kernel self-organizing maps with both single-omics and multi-omics strategies. The comparison of these two approaches demonstrates the benefit of our integration method in improving the representation of the studied biological system.
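To make the idea of a consensus meta-kernel followed by a kernel PCA more concrete, here is a minimal NumPy sketch. It is not the mixKernel implementation: the weighting rule (leading eigenvector of the cosine-similarity matrix between kernels), the linear kernels, the random toy data, and all function names (`linear_kernel`, `cosine_scale`, `consensus_weights`, `kernel_pca`) are assumptions chosen for illustration only.

```python
import numpy as np

def linear_kernel(X):
    """Gram matrix of a linear kernel; samples are the rows of X."""
    return X @ X.T

def cosine_scale(K):
    """Rescale a kernel so each sample has unit self-similarity.
    Assumes every diagonal entry of K is strictly positive."""
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

def consensus_weights(kernels):
    """Illustrative consensus weights (an assumption, not the paper's exact rule):
    leading eigenvector of the cosine-similarity matrix between kernels,
    rescaled to sum to one."""
    m = len(kernels)
    C = np.empty((m, m))
    for i in range(m):
        for j in range(m):
            inner = np.sum(kernels[i] * kernels[j])  # Frobenius inner product
            C[i, j] = inner / (np.linalg.norm(kernels[i]) * np.linalg.norm(kernels[j]))
    _, vecs = np.linalg.eigh(C)          # eigenvalues in ascending order
    v = np.abs(vecs[:, -1])              # abs() removes the sign ambiguity
    return v / v.sum()

def kernel_pca(K, n_components=2):
    """Sample coordinates from the top eigenpairs of the double-centered kernel."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    Kc = H @ K @ H
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    return vecs * np.sqrt(np.clip(vals, 0.0, None))

# Toy data: three hypothetical "omics" tables measured on the same 20 samples.
rng = np.random.default_rng(0)
n_samples = 20
tables = [rng.normal(size=(n_samples, p)) for p in (5, 8, 3)]

kernels = [cosine_scale(linear_kernel(X)) for X in tables]
weights = consensus_weights(kernels)
K_meta = sum(w * K for w, K in zip(weights, kernels))  # convex combination
scores = kernel_pca(K_meta, n_components=2)
```

Because each dataset enters only through its kernel, tables of different types can be combined as long as a suitable kernel exists for each; the linear kernel used here is just the simplest choice.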

AVAILABILITY AND IMPLEMENTATION

The proposed methods are available in the R package mixKernel, released on CRAN. It is fully compatible with the mixOmics package, and a tutorial describing the approach can be found on the mixOmics website: http://mixomics.org/mixkernel/.

CONTACT

jerome.mariette@inra.fr or nathalie.villa-vialaneix@inra.fr.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1
Unsupervised multiple kernel learning for heterogeneous data integration.
Bioinformatics. 2018 Mar 15;34(6):1009-1015. doi: 10.1093/bioinformatics/btx682.
2
Feature selection for kernel methods in systems biology.
NAR Genom Bioinform. 2022 Mar 7;4(1):lqac014. doi: 10.1093/nargab/lqac014. eCollection 2022 Mar.
3
integrOmics: an R package to unravel relationships between two omics datasets.
Bioinformatics. 2009 Nov 1;25(21):2855-6. doi: 10.1093/bioinformatics/btp515. Epub 2009 Aug 25.
4
DeepDRK: a deep learning framework for drug repurposing through kernel-based multi-omics integration.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab048.
5
Fast and interpretable genomic data analysis using multiple approximate kernel learning.
Bioinformatics. 2022 Jun 24;38(Suppl 1):i77-i83. doi: 10.1093/bioinformatics/btac241.
6
Deviance residuals-based sparse PLS and sparse kernel PLS regression for censored data.
Bioinformatics. 2015 Feb 1;31(3):397-404. doi: 10.1093/bioinformatics/btu660. Epub 2014 Oct 6.
7
Clustering and variable selection evaluation of 13 unsupervised methods for multi-omics data integration.
Brief Bioinform. 2020 Dec 1;21(6):2011-2030. doi: 10.1093/bib/bbz138.
8
mixOmics: An R package for 'omics feature selection and multiple data integration.
PLoS Comput Biol. 2017 Nov 3;13(11):e1005752. doi: 10.1371/journal.pcbi.1005752. eCollection 2017 Nov.
9
A powerful framework for an integrative study with heterogeneous omics data: from univariate statistics to multi-block analysis.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa166.
10
Kernel-PCA data integration with enhanced interpretability.
BMC Syst Biol. 2014;8 Suppl 2(Suppl 2):S6. doi: 10.1186/1752-0509-8-S2-S6. Epub 2014 Mar 13.

Cited by

1
Effective integration of multi-omics with prior knowledge to identify biomarkers via explainable graph neural networks.
NPJ Syst Biol Appl. 2025 May 8;11(1):43. doi: 10.1038/s41540-025-00519-9.
2
Impact of foliar application of phyllosphere yeast strains combined with soil fertilizer application on rice growth and yield.
Environ Microbiome. 2024 Dec 18;19(1):102. doi: 10.1186/s40793-024-00635-9.
3
Deep learning-based approaches for multi-omics data integration and analysis.
BioData Min. 2024 Oct 2;17(1):38. doi: 10.1186/s13040-024-00391-z.
4
An explainable graph neural network approach for effectively integrating multi-omics with prior knowledge to identify biomarkers from interacting biological domains.
bioRxiv. 2024 Sep 26:2024.08.23.609465. doi: 10.1101/2024.08.23.609465.
5
Application of Mass Cytometry Platforms to Solid Organ Transplantation.
Transplantation. 2024 Oct 1;108(10):2034-2044. doi: 10.1097/TP.0000000000004925. Epub 2024 Mar 12.
6
MEMMAL: A tool for expanding large-scale mechanistic models with machine learned associations and big datasets.
Front Syst Biol. 2023;3. doi: 10.3389/fsysb.2023.1099413. Epub 2023 Mar 9.
7
A toolbox of machine learning software to support microbiome analysis.
Front Microbiol. 2023 Nov 22;14:1250806. doi: 10.3389/fmicb.2023.1250806. eCollection 2023.
8
Imaging and multi-omics datasets converge to define different neural progenitor origins for ATRT-SHH subgroups.
Nat Commun. 2023 Oct 20;14(1):6669. doi: 10.1038/s41467-023-42371-7.
9
Asterics: a simple tool for the ExploRation and Integration of omiCS data.
BMC Bioinformatics. 2023 Oct 18;24(1):391. doi: 10.1186/s12859-023-05504-9.
10
Improvement of variables interpretability in kernel PCA.
BMC Bioinformatics. 2023 Jul 12;24(1):282. doi: 10.1186/s12859-023-05404-y.