Accurate cancer phenotype prediction with AKLIMATE, a stacked kernel learner integrating multimodal genomic data and pathway knowledge.

Affiliation

Department of Biomolecular Engineering, University of California, Santa Cruz, California, United States of America.

Publication information

PLoS Comput Biol. 2021 Apr 16;17(4):e1008878. doi: 10.1371/journal.pcbi.1008878. eCollection 2021 Apr.

Abstract

Advancements in sequencing have led to the proliferation of multi-omic profiles of human cells under different conditions and perturbations. In addition, many databases have amassed information about pathways and gene "signatures," which are patterns of gene expression associated with specific cellular and phenotypic contexts. An important current challenge in systems biology is to leverage such knowledge about gene coordination to maximize the predictive power and generalization of models applied to high-throughput datasets. However, few such integrative approaches also provide interpretable results that quantify the importance of individual genes and pathways to model accuracy. We introduce AKLIMATE, a first kernel-based stacked learner that seamlessly incorporates multi-omics feature data with prior information in the form of pathways for either regression or classification tasks. AKLIMATE uses a novel multiple-kernel learning framework in which individual kernels capture the prediction propensities recorded in random forests, each built from a specific pathway gene set that integrates all omics data for its member genes. AKLIMATE has comparable or improved performance relative to state-of-the-art methods on diverse phenotype learning tasks, including predicting microsatellite instability in endometrial and colorectal cancer, survival in breast cancer, and cell line response to gene knockdowns. We show how AKLIMATE connects feature data across data platforms through their common pathways to identify examples of several known and novel contributors to cancer and synthetic lethality.
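The abstract does not spell out how a random forest becomes a kernel or how pathway kernels are combined, so the following is only a minimal sketch under stated assumptions. It substitutes the classic random-forest proximity kernel (the fraction of trees in which two samples share a leaf) for AKLIMATE's own kernel definition, and it uses fixed, hypothetical pathway weights where a multiple-kernel learner would fit them. The names `proximity_kernel`, `combine_kernels`, `pathway_A`, and `pathway_B`, and all leaf assignments, are invented for illustration.

```python
from typing import Dict, List

Matrix = List[List[float]]


def proximity_kernel(leaves: List[List[int]]) -> Matrix:
    """Random-forest proximity kernel (a simplified stand-in, not AKLIMATE's kernel).

    leaves[i][t] is the index of the leaf that sample i reaches in tree t.
    K[i][j] is the fraction of trees in which samples i and j share a leaf.
    """
    n_samples = len(leaves)
    n_trees = len(leaves[0])
    kernel = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(n_samples):
            shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
            kernel[i][j] = shared / n_trees
    return kernel


def combine_kernels(kernels: Dict[str, Matrix], weights: Dict[str, float]) -> Matrix:
    """Weighted sum of per-pathway kernels, with weights normalized to sum to 1.

    In multiple-kernel learning the weights are fitted; here they are given.
    """
    total = sum(weights[name] for name in kernels)
    n_samples = len(next(iter(kernels.values())))
    combined = [[0.0] * n_samples for _ in range(n_samples)]
    for name, kernel in kernels.items():
        w = weights[name] / total
        for i in range(n_samples):
            for j in range(n_samples):
                combined[i][j] += w * kernel[i][j]
    return combined


# Hypothetical leaf assignments for 3 samples; one forest per pathway gene set.
pathway_leaves = {
    "pathway_A": [[0, 1, 2, 0], [0, 1, 3, 1], [4, 5, 6, 7]],  # 4 trees
    "pathway_B": [[1, 1], [1, 2], [1, 1]],                    # 2 trees
}
# Hypothetical pathway weights (assumed, not learned).
pathway_weights = {"pathway_A": 3.0, "pathway_B": 1.0}

pathway_kernels = {name: proximity_kernel(lv) for name, lv in pathway_leaves.items()}
combined = combine_kernels(pathway_kernels, pathway_weights)

for row in combined:
    print(row)
```

In a setup like this, the normalized weights offer one rough reading of pathway importance, in the spirit of the interpretability the abstract describes. The combined matrix could then be passed to any kernel method that accepts a precomputed kernel.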
