MultiPLIER: A Transfer Learning Framework for Transcriptomics Reveals Systemic Features of Rare Disease.

Affiliations

Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, PA, USA; Childhood Cancer Data Laboratory, Alex's Lemonade Stand Foundation, Philadelphia, PA, USA.

National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD, USA.

Publication information

Cell Syst. 2019 May 22;8(5):380-394.e4. doi: 10.1016/j.cels.2019.04.003.

DOI:10.1016/j.cels.2019.04.003

PMID:31121115

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6538307/

Abstract

Most gene expression datasets generated by individual researchers are too small to fully benefit from unsupervised machine-learning methods. In the case of rare diseases, there may be too few cases available, even when multiple studies are combined. To address this challenge, we utilize transfer learning to extract coordinated expression patterns and use learned patterns to analyze small rare disease datasets. We trained a pathway-level information extractor (PLIER) model on a large public data compendium comprising multiple experiments, tissues, and biological conditions and then transferred the model to small datasets in an approach we call MultiPLIER. Models constructed from the public data compendium included features that aligned well to known biological factors and were more comprehensive than those constructed from individual datasets or conditions. When transferred to rare disease datasets, the models describe biological processes related to disease severity more effectively than models trained only on a given dataset.
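
The transfer step described in the abstract can be illustrated with a small sketch: learn a gene-by-latent-variable loading matrix from a large compendium, then project a small dataset into that learned space. This is not the authors' implementation. It assumes plain truncated SVD as a stand-in for PLIER training (no pathway priors), a ridge-regularized projection, and synthetic data; the names `learn_loadings`, `project`, and `lam` are hypothetical.

```python
import numpy as np

def learn_loadings(Y_train, k):
    """Stand-in for PLIER training (assumption): truncated SVD, no pathway priors.
    Y_train is genes x samples; returns a genes x k loading matrix Z."""
    U, _, _ = np.linalg.svd(Y_train, full_matrices=False)
    return U[:, :k]

def project(Z, Y_new, lam=1e-6):
    """Ridge-regularized projection of a new genes x samples matrix onto Z.
    Solves (Z^T Z + lam * I) B = Z^T Y_new for the k x samples matrix B."""
    k = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(k), Z.T @ Y_new)

# Synthetic data: a large "compendium" and a small dataset sharing the same latent structure.
rng = np.random.default_rng(0)
n_genes, k = 50, 3
Z_true = rng.normal(size=(n_genes, k))
Y_compendium = Z_true @ rng.normal(size=(k, 200))  # many samples
Y_small = Z_true @ rng.normal(size=(k, 8))         # few samples, same genes

Z = learn_loadings(Y_compendium, k)   # learned once on the large compendium
B_small = project(Z, Y_small)         # latent-variable values for the small dataset

# Relative reconstruction error of the small dataset from the transferred model.
rel_err = np.linalg.norm(Y_small - Z @ B_small) / np.linalg.norm(Y_small)
```

The point of the design is that the small dataset never needs to be large enough to train a model of its own: only the projection is computed on it, and both matrices must share the same gene ordering.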

Similar articles

MultiPLIER: A Transfer Learning Framework for Transcriptomics Reveals Systemic Features of Rare Disease.

Cell Syst. 2019 May 22;8(5):380-394.e4. doi: 10.1016/j.cels.2019.04.003.

Evaluation of Taroni et al.: Understanding Rare Diseases by MultiPLIER.

Cell Syst. 2019 May 22;8(5):359-360. doi: 10.1016/j.cels.2019.05.001.

MousiPLIER: A Mouse Pathway-Level Information Extractor Model.

eNeuro. 2024 Jun 5;11(6). doi: 10.1523/ENEURO.0313-23.2024. Print 2024 Jun.

Exploring combinations of dimensionality reduction, transfer learning, and regularization methods for predicting binary phenotypes with transcriptomic data.

BMC Bioinformatics. 2024 Apr 26;25(1):167. doi: 10.1186/s12859-024-05795-6.

spatiAlign: an unsupervised contrastive learning model for data integration of spatially resolved transcriptomics.

Gigascience. 2024 Jan 2;13. doi: 10.1093/gigascience/giae042.

Machine learning analysis of gene expression data reveals novel diagnostic and prognostic biomarkers and identifies therapeutic targets for soft tissue sarcomas.

PLoS Comput Biol. 2019 Feb 20;15(2):e1006826. doi: 10.1371/journal.pcbi.1006826. eCollection 2019 Feb.

Machine learning and related approaches in transcriptomics.

Biochem Biophys Res Commun. 2024 Sep 10;724:150225. doi: 10.1016/j.bbrc.2024.150225. Epub 2024 Jun 4.

HetEnc: a deep learning predictive model for multi-type biological dataset.

BMC Genomics. 2019 Aug 8;20(1):638. doi: 10.1186/s12864-019-5997-2.

Machine learning in rare disease.

Nat Methods. 2023 Jun;20(6):803-814. doi: 10.1038/s41592-023-01886-z. Epub 2023 May 29.

Exploring Genome-Wide Expression Profiles Using Machine Learning Techniques.

Methods Mol Biol. 2017;1537:347-364. doi: 10.1007/978-1-4939-6685-1_20.

Cited by

MOTL: enhancing multi-omics matrix factorization with transfer learning.

Genome Biol. 2025 Jul 25;26(1):224. doi: 10.1186/s13059-025-03675-7.

Integrating Artificial Intelligence in Next-Generation Sequencing: Advances, Challenges, and Future Directions.

Curr Issues Mol Biol. 2025 Jun 19;47(6):470. doi: 10.3390/cimb47060470.

GRACKLE: an interpretable matrix factorization approach for biomedical representation learning.

Bioinformatics. 2025 Jul 1;41(Supplement_1):i609-i618. doi: 10.1093/bioinformatics/btaf213.

Prediction of Cerebrospinal Fluid (CSF) Pressure with Generative Adversarial Network Synthetic Plasma-CSF Biomarker Pairing.

Neuroinformatics. 2025 Jul 10;23(3):38. doi: 10.1007/s12021-025-09729-2.

Exploring unsupervised feature extraction algorithms: tackling high dimensionality in small datasets.

Sci Rep. 2025 Jul 1;15(1):21973. doi: 10.1038/s41598-025-07725-9.

PLIERv2: bigger, better and faster.

bioRxiv. 2025 Jun 8:2025.06.05.658122. doi: 10.1101/2025.06.05.658122.

Can AI reveal the next generation of high-impact bone genomics targets?

Bone Rep. 2025 Mar 24;25:101839. doi: 10.1016/j.bonr.2025.101839. eCollection 2025 Jun.

Translational approaches to the study of eosinophils in vasculitis.

Rheumatology (Oxford). 2025 Mar 1;64(Supplement_1):i19-i23. doi: 10.1093/rheumatology/keaf005.

Genetic Studies Through the Lens of Gene Networks.

Annu Rev Biomed Data Sci. 2025 Feb 20. doi: 10.1146/annurev-biodatasci-103123-095355.

Recent Development, Applications, and Patents of Artificial Intelligence in Drug Design and Development.

Curr Drug Discov Technol. 2025 Feb 10. doi: 10.2174/0115701638364199250123062248.

References

Pathway-level information extractor (PLIER) for gene expression data.

Nat Methods. 2019 Jul;16(7):607-610. doi: 10.1038/s41592-019-0456-1. Epub 2019 Jun 27.

Decomposing Cell Identity for Transfer Learning across Cellular Measurements, Platforms, Tissues, and Species.

Cell Syst. 2019 May 22;8(5):395-411.e8. doi: 10.1016/j.cels.2019.04.004.

Enter the Matrix: Factorization Uncovers Knowledge from Omics.

Trends Genet. 2018 Oct;34(10):790-805. doi: 10.1016/j.tig.2018.07.003. Epub 2018 Aug 22.

Metabolic pathways and immunometabolism in rare kidney diseases.

Ann Rheum Dis. 2018 Aug;77(8):1226-1233. doi: 10.1136/annrheumdis-2017-212935. Epub 2018 May 3.

Extracting a biologically relevant latent space from cancer transcriptomes with variational autoencoders.

Pac Symp Biocomput. 2018;23:80-91.

Cell-specific prediction and application of drug-induced gene expression profiles.

Pac Symp Biocomput. 2018;23:32-43.

A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles.

Cell. 2017 Nov 30;171(6):1437-1452.e17. doi: 10.1016/j.cell.2017.10.049.

Efficient Generation of Transcriptomic Profiles by Random Composite Measurements.

Cell. 2017 Nov 30;171(6):1424-1436.e18. doi: 10.1016/j.cell.2017.10.023. Epub 2017 Nov 16.

The Reactome Pathway Knowledgebase.

Nucleic Acids Res. 2018 Jan 4;46(D1):D649-D655. doi: 10.1093/nar/gkx1132.

Unsupervised Extraction of Stable Expression Signatures from Public Compendia with an Ensemble of Neural Networks.

Cell Syst. 2017 Jul 26;5(1):63-71.e6. doi: 10.1016/j.cels.2017.06.003. Epub 2017 Jul 12.
