MOGSA: Integrative Single Sample Gene-set Analysis of Multiple Omics Data.

Affiliations

Chair of Proteomics and Bioanalytics, Technische Universität München, Freising, Germany; Bavarian Biomolecular Mass Spectrometry Center (BayBioMS), TUM, Freising, Germany.

Department of Data Science, Division of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts 02215.

Publication information

Mol Cell Proteomics. 2019 Aug 9;18(8 suppl 1):S153-S168. doi: 10.1074/mcp.TIR118.001251. Epub 2019 Jun 26.

DOI: 10.1074/mcp.TIR118.001251
PMID: 31243065
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6692785/
Abstract

Gene-set analysis (GSA) summarizes individual molecular measurements into more interpretable pathways or gene-sets and has become an indispensable step in the interpretation of large-scale omics data. However, GSA methods are limited to the analysis of single omics data. Here, we introduce a new computational method termed multi-omics gene-set analysis (MOGSA), a multivariate single sample gene-set analysis method that integrates multiple experimental and molecular data types measured over the same set of samples. The method learns a low-dimensional representation of the most variant correlated features (genes, proteins, etc.) across multiple omics data sets, transforms the features onto the same scale, and calculates an integrated gene-set score from the most informative features in each data type. MOGSA does not require filtering data to the intersection of features (gene IDs); therefore, all molecular features, including those that lack annotation, may be included in the analysis. Using simulated data, we demonstrate that integrating multiple diverse sources of molecular data increases the power to discover subtle changes in gene-sets and may reduce the impact of unreliable information in any single data type. Using real experimental data, we demonstrate three use-cases of MOGSA. First, we show how to remove a source of noise (technical or biological) in integrative MOGSA of NCI60 transcriptome and proteome data. Second, we apply MOGSA to discover similarities and differences in mRNA, protein and phosphorylation profiles of a small study of stem cell lines and assess the influence of each data type or feature on the total gene-set score. Finally, we apply MOGSA to cluster analysis and show that three molecular subtypes are robustly discovered when copy number variation and mRNA data of 308 bladder cancers from The Cancer Genome Atlas are integrated using MOGSA. MOGSA is available in the Bioconductor R package "mogsa."
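
The three steps named in the abstract (learn a low-dimensional representation across data sets, put the features on a common scale, and sum the most informative features into a per-sample gene-set score) can be illustrated with a small sketch. This is not the API or the exact algorithm of the "mogsa" package: the function name `integrated_gene_set_scores`, the toy data, and the use of a plain SVD on stacked, block-scaled matrices are all assumptions made for illustration, standing in for whatever multi-table decomposition the package actually uses.

```python
import numpy as np


def integrated_gene_set_scores(blocks, annotations, n_components=1):
    """Toy single-sample gene-set scores from several omics blocks.

    blocks:      list of (features x samples) arrays with identical sample columns.
    annotations: list of (features x gene_sets) 0/1 arrays, one per block.
    Returns a (gene_sets x samples) score matrix.
    Hypothetical helper for illustration only; not part of the mogsa package.
    """
    scaled = []
    for X in blocks:
        Xc = X - X.mean(axis=1, keepdims=True)  # center each feature across samples
        Xc = Xc / np.linalg.norm(Xc)            # unit Frobenius norm: blocks become comparable
        scaled.append(Xc)
    stacked = np.vstack(scaled)                 # all features of all blocks x samples

    # Low-dimensional representation: keep only the top components.
    U, s, Vt = np.linalg.svd(stacked, full_matrices=False)
    k = n_components
    reconstructed = (U[:, :k] * s[:k]) @ Vt[:k]

    # Sum the reconstructed values of the member features of each gene set.
    membership = np.vstack(annotations)         # all features x gene_sets
    return membership.T @ reconstructed


# Simulated example: 6 samples, an "mRNA" block and a "protein" block with
# different feature counts (no intersection of feature IDs is required).
rng = np.random.default_rng(0)
n_samples = 6
mrna = rng.normal(0.0, 0.1, size=(20, n_samples))
protein = rng.normal(0.0, 0.1, size=(12, n_samples))
mrna[:5, :3] += 3.0      # gene set 0 is up in samples 0-2 at the mRNA level
protein[:4, :3] += 3.0   # and at the protein level

mrna_sets = np.zeros((20, 2))
mrna_sets[:5, 0] = 1
mrna_sets[10:15, 1] = 1
protein_sets = np.zeros((12, 2))
protein_sets[:4, 0] = 1
protein_sets[8:12, 1] = 1

scores = integrated_gene_set_scores([mrna, protein], [mrna_sets, protein_sets])
print(np.round(scores, 2))
```

In this toy setup the first gene set receives positive scores in the three samples where its members are elevated and negative scores elsewhere, while the second gene set, whose members carry only noise, stays near zero. Scaling each block to unit total variance before stacking is one simple way to keep a large block from dominating a small one.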
