Gardeux Vincent, Arslan Ahmet D, Achour Ikbel, Ho Tsui-Ting, Beck William T, Lussier Yves A
BMC Med Genomics. 2014;7 Suppl 1(Suppl 1):S1. doi: 10.1186/1755-8794-7-S1-S1. Epub 2014 May 8.
Genome-wide transcriptome profiling by microarray and RNA-Seq often yields deregulated genes or pathways that apply only to larger cohorts. On the other hand, individualized interpretation of transcriptomes is increasingly pursued to improve diagnosis, prognosis, and patient treatment. Yet robust and accurate methods based on a single paired sample remain an unmet challenge.
"N-of-1-pathways" translates gene expression profiles into mechanism-level profiles on single pairs of samples (one p-value per geneset). It relies on three principles: i) the statistical universe is a single paired sample, which serves as its own control; ii) statistics can be derived from multiple gene expression measures that share common biological mechanisms, represented as genesets; iii) a semantic similarity metric takes into account relationships between mechanisms to better assess commonalities and differences within and across study samples (e.g. patients, cell lines, tissues), which helps the interpretation of the underlying biology.
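The first two principles can be illustrated with a minimal sketch. The abstract does not name the paired statistic, so the code below swaps in an exact two-sided sign test as a simple stand-in; the function names, the dictionary-based inputs, and the gene and geneset labels are all hypothetical and are not taken from the authors' implementation.

```python
from math import comb


def sign_test_p(before, after):
    """Exact two-sided sign test on paired values (stand-in statistic, an assumption)."""
    # Keep only the pairs that actually differ; ties carry no sign information.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        return 1.0
    k = sum(d > 0 for d in diffs)          # number of genes that went up
    tail = min(k, n - k)                   # size of the smaller tail
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


def geneset_pvalues(sample_a, sample_b, genesets):
    """Return one p-value per geneset for a single pair of samples.

    sample_a, sample_b: dict mapping gene -> expression value (the paired samples).
    genesets: dict mapping geneset name -> list of genes.
    """
    result = {}
    for name, genes in genesets.items():
        # Only genes measured in both samples can be paired.
        shared = [g for g in genes if g in sample_a and g in sample_b]
        result[name] = sign_test_p(
            [sample_a[g] for g in shared],
            [sample_b[g] for g in shared],
        )
    return result


# Hypothetical toy data: six genes rise together in set_up, set_mixed is split.
control = {f"g{i}": 1.0 for i in range(1, 11)}
treated = dict(control, g1=2.0, g2=2.5, g3=3.0, g4=2.2, g5=1.8, g6=2.1, g7=0.5, g8=0.4)
sets = {
    "set_up": ["g1", "g2", "g3", "g4", "g5", "g6"],
    "set_mixed": ["g1", "g2", "g7", "g8"],
}
pvals = geneset_pvalues(control, treated, sets)
```

In this toy example the geneset whose genes all move in the same direction receives a small p-value, while the split geneset does not, even though only one pair of samples is available. A rank-based paired test would additionally use the magnitude of each change.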
In the context of underpowered experiments, N-of-1-pathways predictions perform better than or comparably to those of GSEA and Differentially Expressed Genes enrichment (DEG enrichment), within and across datasets. N-of-1-pathways uncovered concordant PTBP1-dependent mechanisms across datasets (odds ratios ≥ 13, p-values ≤ 1 × 10⁻⁵), such as RNA splicing and cell cycle. In addition, it unveils tissue-specific mechanisms of alternatively transcribed PTBP1-dependent genesets. Furthermore, we demonstrate that GSEA and DEG enrichment cannot accurately analyze single paired samples.
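For readers unfamiliar with how concordance between two datasets can be summarized as an odds ratio with a p-value, the following sketch computes both from a 2×2 table of genesets. The abstract does not say how its odds ratios were obtained, so the one-sided Fisher's exact test used here is an assumption, and the counts in the example are invented for illustration rather than taken from the study.

```python
from math import comb


def odds_ratio(both, only_a, only_b, neither):
    """Sample odds ratio of a 2x2 concordance table."""
    denom = only_a * only_b
    # Undefined when a discordant cell is zero; reported as infinity for simplicity.
    return float("inf") if denom == 0 else (both * neither) / denom


def fisher_one_sided(both, only_a, only_b, neither):
    """P(overlap >= observed) under the hypergeometric null of no association."""
    n = both + only_a + only_b + neither
    row = both + only_a    # genesets deregulated in dataset A
    col = both + only_b    # genesets deregulated in dataset B
    upper = min(row, col)  # largest overlap the margins allow
    favourable = sum(
        comb(row, k) * comb(n - row, col - k) for k in range(both, upper + 1)
    )
    return favourable / comb(n, col)


# Hypothetical counts: 3 genesets deregulated in both datasets, 1 only in A,
# 1 only in B, and 3 in neither.
toy_or = odds_ratio(3, 1, 1, 3)
toy_p = fisher_one_sided(3, 1, 1, 3)
```

With these toy counts the odds ratio is 9 and the one-sided p-value is 17/70, so a table this small would not support a claim of concordance; the thresholds reported in the abstract correspond to much stronger overlap.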
N-of-1-pathways enables robust and biologically relevant mechanism-level classifiers with small cohorts and single paired samples, surpassing conventional methods. Further, it identifies mechanisms unique to a sample or patient, a requirement for precision medicine.