Unsupervised multiple kernel learning for heterogeneous data integration.

Affiliation

MIAT, Université de Toulouse, INRA, 31326 Castanet-Tolosan, France.

Publication information

Bioinformatics. 2018 Mar 15;34(6):1009-1015. doi: 10.1093/bioinformatics/btx682.


PMID:29077792
Abstract

MOTIVATION: Recent advances in high-throughput sequencing have expanded the breadth of available omics datasets, and the integrated analysis of multiple datasets obtained on the same samples has yielded important insights in a wide range of applications. However, integrating various sources of information remains a challenge for systems biology because the datasets produced are often of heterogeneous types, which calls for generic methods that take their different specificities into account.

RESULTS: We propose a multiple kernel framework that integrates multiple datasets of various types into a single exploratory analysis. Several solutions are provided to learn either a consensus meta-kernel or a meta-kernel that preserves the original topology of the datasets. We applied our framework to analyse two public multi-omics datasets. First, the multiple metagenomic datasets collected during the TARA Oceans expedition were explored to demonstrate that our method is able to retrieve previous findings in a single kernel PCA, as well as to provide a new picture of the sample structures when a larger number of datasets are included in the analysis. To perform this analysis, a generic procedure is also proposed to improve the interpretability of the kernel PCA with regard to the original data. Second, the multi-omics breast cancer datasets provided by The Cancer Genome Atlas are analysed using a kernel Self-Organizing Map with both single-omics and multi-omics strategies. The comparison of these two approaches demonstrates the benefit of our integration method for improving the representation of the studied biological system.

AVAILABILITY AND IMPLEMENTATION: The proposed methods are available in the R package mixKernel, released on CRAN. It is fully compatible with the mixOmics package, and a tutorial describing the approach can be found on the mixOmics website: http://mixomics.org/mixkernel/.

CONTACT: jerome.mariette@inra.fr or nathalie.villa-vialaneix@inra.fr.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
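The general idea of combining one kernel per dataset into a meta-kernel and then exploring it with kernel PCA can be illustrated with a minimal NumPy sketch. This is not the mixKernel implementation and not the weight-learning procedures of the paper: it assumes a plain uniform average of cosine-normalized kernels as a stand-in for a learned consensus, and the function names, the random toy data, and the Gaussian bandwidth `gamma` are all hypothetical choices made for illustration.

```python
import numpy as np

def linear_kernel(X):
    """Gram matrix of inner products between samples (rows of X)."""
    return X @ X.T

def gaussian_kernel(X, gamma):
    """Gaussian (RBF) kernel exp(-gamma * squared Euclidean distance)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def cosine_normalize(K):
    """Rescale a kernel so every sample has unit self-similarity."""
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

def combine_kernels(kernels, weights=None):
    """Weighted sum of kernels; uniform weights are an assumption here,
    standing in for weights that a real method would learn."""
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    return sum(w * K for w, K in zip(weights, kernels))

def kernel_pca(K, n_components=2):
    """Kernel PCA: double-center K, then take the leading eigenpairs."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    vals, vecs = np.linalg.eigh(Kc)            # ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_components]
    vals = np.clip(vals[order], 0.0, None)     # guard against tiny negatives
    scores = vecs[:, order] * np.sqrt(vals)    # sample coordinates
    return scores, vals

# Toy data: two "omics" blocks measured on the same 20 samples.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(20, 5))
X2 = rng.normal(size=(20, 8))

K1 = cosine_normalize(linear_kernel(X1))
K2 = cosine_normalize(gaussian_kernel(X2, gamma=1.0 / X2.shape[1]))

K_meta = combine_kernels([K1, K2])
scores, eigvals = kernel_pca(K_meta, n_components=2)
print(scores[:3])
```

Because each block enters the analysis only through its kernel, a different similarity measure can be chosen per data type, which is what makes a kernel-based approach suitable for heterogeneous datasets.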


Similar articles

[1]
Unsupervised multiple kernel learning for heterogeneous data integration.

Bioinformatics. 2018-3-15

[2]
Feature selection for kernel methods in systems biology.

NAR Genom Bioinform. 2022-3-7

[3]
integrOmics: an R package to unravel relationships between two omics datasets.

Bioinformatics. 2009-8-25

[4]
DeepDRK: a deep learning framework for drug repurposing through kernel-based multi-omics integration.

Brief Bioinform. 2021-9-2

[5]
Fast and interpretable genomic data analysis using multiple approximate kernel learning.

Bioinformatics. 2022-6-24

[6]
Deviance residuals-based sparse PLS and sparse kernel PLS regression for censored data.

Bioinformatics. 2014-10-6

[7]
Clustering and variable selection evaluation of 13 unsupervised methods for multi-omics data integration.

Brief Bioinform. 2020-12-1

[8]
mixOmics: An R package for 'omics feature selection and multiple data integration.

PLoS Comput Biol. 2017-11-3

[9]
A powerful framework for an integrative study with heterogeneous omics data: from univariate statistics to multi-block analysis.

Brief Bioinform. 2021-5-20

[10]
Kernel-PCA data integration with enhanced interpretability.

BMC Syst Biol. 2014

Cited by

[1]
Effective integration of multi-omics with prior knowledge to identify biomarkers via explainable graph neural networks.

NPJ Syst Biol Appl. 2025-5-8

[2]
Impact of foliar application of phyllosphere yeast strains combined with soil fertilizer application on rice growth and yield.

Environ Microbiome. 2024-12-18

[3]
Deep learning-based approaches for multi-omics data integration and analysis.

BioData Min. 2024-10-2

[4]
An explainable graph neural network approach for effectively integrating multi-omics with prior knowledge to identify biomarkers from interacting biological domains.

bioRxiv. 2024-9-26

[5]
Application of Mass Cytometry Platforms to Solid Organ Transplantation.

Transplantation. 2024-10-1

[6]
MEMMAL: A tool for expanding large-scale mechanistic models with machine learned associations and big datasets.

Front Syst Biol. 2023

[7]
A toolbox of machine learning software to support microbiome analysis.

Front Microbiol. 2023-11-22

[8]
Imaging and multi-omics datasets converge to define different neural progenitor origins for ATRT-SHH subgroups.

Nat Commun. 2023-10-20

[9]
Asterics: a simple tool for the ExploRation and Integration of omiCS data.

BMC Bioinformatics. 2023-10-18

[10]
Improvement of variables interpretability in kernel PCA.

BMC Bioinformatics. 2023-7-12
