Enrichment analysis on regulatory subspaces: A novel direction for the superior description of cellular responses to SARS-CoV-2.

Author information

IDMEC, Instituto Superior Tecnico, Universidade de Lisboa, Lisbon, Portugal; INESC-ID and Instituto Superior Tecnico, Universidade de Lisboa, Lisbon, Portugal.

IDMEC, Instituto Superior Tecnico, Universidade de Lisboa, Lisbon, Portugal; LAQV-REQUIMTE, DQ, NOVA School of Science and Technology, Caparica, Portugal.

Publication information

Comput Biol Med. 2022 Jul;146:105443. doi: 10.1016/j.compbiomed.2022.105443. Epub 2022 Apr 25.

Abstract

STATEMENT

Enrichment analysis of cell transcriptional responses to SARS-CoV-2 infection based on biclustering solutions yields broader coverage and superior enrichment of GO terms and KEGG pathways than alternative state-of-the-art machine learning solutions, thus aiding knowledge extraction.

MOTIVATION AND METHODS

A comprehensive understanding of how the SARS-CoV-2 virus affects infected cells is still lacking. This work compares state-of-the-art machine learning approaches for studying the cell regulatory processes affected and induced by the SARS-CoV-2 virus, using transcriptomic data from both infectable cell lines available in public databases and in vivo samples. In particular, we assess the relevance of clustering, biclustering, and predictive modeling methods for functional enrichment. Statistical principles for handling scarcity of observations, high data dimensionality, and complex gene interactions are further discussed. Without loss of generality, the proposed methods are applied to study the differential regulatory response of lung cell lines to SARS-CoV-2 (α-variant) against RSV, IAV (H1N1), and HPIV3 viruses.
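As an illustration of what functional enrichment of a gene group involves, the following is a minimal sketch of an over-representation test, assuming a one-sided hypergeometric test per term followed by Benjamini-Hochberg correction. It is not the pipeline used in this work: the function names, the gene identifiers (`g0` ... `g19`), and the term names (`term_A`, `term_B`) are hypothetical placeholders, and the gene group stands in for the genes of a single cluster or bicluster.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrich(gene_group, term_to_genes, background):
    """Map each term to (raw p-value, adjusted p-value) for one gene group."""
    universe = set(background)
    group = set(gene_group) & universe
    terms = sorted(term_to_genes)
    pvals = []
    for term in terms:
        annotated = set(term_to_genes[term]) & universe
        overlap = len(group & annotated)
        pvals.append(
            hypergeom_pvalue(overlap, len(group), len(annotated), len(universe))
        )
    return dict(zip(terms, zip(pvals, benjamini_hochberg(pvals))))


# Toy data (hypothetical): 20 background genes, two annotation terms.
background = [f"g{i}" for i in range(20)]
term_to_genes = {
    "term_A": [f"g{i}" for i in range(0, 5)],
    "term_B": [f"g{i}" for i in range(10, 15)],
}
gene_group = ["g0", "g1", "g2", "g3", "g10"]

results = enrich(gene_group, term_to_genes, background)
for term, (p, q) in results.items():
    print(term, round(p, 4), round(q, 4))
```

In this toy setting, four of the five genes in the group carry `term_A`, so that term remains significant after correction while `term_B` does not. In practice, the same test would be repeated for every cluster or bicluster, and the correction would need to account for all tests performed.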

RESULTS

The results show that, although clustering and predictive algorithms support classic approaches to functional enrichment analysis, more recent pattern-based biclustering algorithms significantly improve the number and quality of enriched GO terms and KEGG pathways while controlling the risk of false positives. Additionally, a comparative analysis of these results is performed to identify potential pathophysiological characteristics of COVID-19. These are further compared with those identified by other authors for the same virus and for related ones such as SARS-CoV-1. The findings are particularly relevant given the lack of other works applying more complex machine learning algorithms in this context.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41a5/9040465/dbf176e33aef/gr1_lrg.jpg
