Single sample scoring of molecular phenotypes.

Affiliations

University of Melbourne Department of Surgery, St. Vincent's Hospital, Melbourne, VIC, 3065, Australia.

Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3051, Australia.

Publication information

BMC Bioinformatics. 2018 Nov 6;19(1):404. doi: 10.1186/s12859-018-2435-4.

DOI:10.1186/s12859-018-2435-4

PMID:30400809

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6219008/

Abstract

BACKGROUND

Gene set scoring provides a useful approach for quantifying concordance between sample transcriptomes and selected molecular signatures. Most methods use information from all samples to score an individual sample, which leads to unstable scores in small data sets and introduces biases from sample composition (e.g. varying numbers of samples for different cancer subtypes). To address these issues, we have developed a truly single sample scoring method and an associated R/Bioconductor package, singscore (https://bioconductor.org/packages/singscore).

RESULTS

We use multiple cancer data sets to compare singscore against widely used methods, including GSVA, z-score, PLAGE, and ssGSEA. Our approach does not depend upon background samples, and scores are thus stable regardless of the composition and number of samples being scored. In contrast, scores obtained by GSVA, z-score, PLAGE and ssGSEA can be unstable when fewer samples are available (N < 25). The singscore method performs as well as the best performing methods in terms of power, recall, false positive rate and computational time, and provides consistently high and balanced performance across all these criteria. To enhance the impact and utility of our method, we have also included a set of functions implementing visual analysis and diagnostics to support the exploration of molecular phenotypes in single samples and across populations of data.
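To make the idea of scoring without background samples concrete, here is a minimal Python sketch of a rank-based single-sample score. It is an illustration only and not the singscore package, which is written in R. The function name `rank_score`, the toy gene names, and the naive handling of ties are all assumptions made for this example; the normalisation by the theoretical minimum and maximum mean rank reflects our reading of the general approach rather than the package's exact code.

```python
from statistics import mean


def rank_score(expression, gene_set):
    """Score one sample against a gene set using only that sample's own ranks.

    expression: dict mapping gene name -> expression value for ONE sample.
    gene_set:   iterable of gene names expected to be highly expressed.
    Returns a score in [-0.5, 0.5]; higher means the set ranks nearer the top.
    (Hypothetical helper for illustration, not part of any library.)
    """
    # Rank genes within this sample only (1 = lowest expression).
    # Ties are broken by insertion order, a simplification for this sketch.
    ordered = sorted(expression, key=expression.get)
    ranks = {gene: position + 1 for position, gene in enumerate(ordered)}

    n_total = len(ranks)
    members = [gene for gene in gene_set if gene in ranks]
    if not members:
        raise ValueError("no gene in the set was measured in this sample")
    n_set = len(members)

    # Mean rank of the gene set, scaled by the number of measured genes.
    scaled_mean_rank = mean(ranks[gene] for gene in members) / n_total

    # Smallest and largest values the scaled mean rank can take for a set
    # of this size, used to map the score onto a fixed range.
    lowest = (n_set + 1) / (2 * n_total)
    highest = (2 * n_total - n_set + 1) / (2 * n_total)

    # Normalise to [0, 1], then centre on zero.
    return (scaled_mean_rank - lowest) / (highest - lowest) - 0.5


if __name__ == "__main__":
    # Two toy samples scored independently: neither score uses the other sample.
    sample_a = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0, "E": 5.0, "F": 6.0}
    sample_b = {"A": 6.0, "B": 5.0, "C": 4.0, "D": 3.0, "E": 2.0, "F": 1.0}
    for name, sample in (("sample_a", sample_a), ("sample_b", sample_b)):
        print(name, rank_score(sample, ["E", "F"]))
```

Because the function takes a single sample as input, adding or removing other samples from a data set cannot change a sample's score, which is the property the comparison above relies on.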

CONCLUSIONS

The singscore method described here functions independently of sample composition in gene expression data and thus provides stable scores, which are particularly useful for small data sets or data integration. Singscore performs well across all performance criteria, and includes a suite of powerful visualization functions to assist in the interpretation of results. This method performs as well as or better than other scoring approaches in terms of its power to distinguish samples with distinct biology and its ability to call true differential gene sets between two conditions. These scores can be used for dimensionality reduction of transcriptomic data, and the phenotypic landscapes obtained by scoring samples against multiple molecular signatures may provide insights for sample stratification.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5398/6219008/803a47eb32a6/12859_2018_2435_Fig1_HTML.jpg
