Zhang Youyi, Morris Jeffrey S, Aerry Shivali Narang, Rao Arvind U K, Baladandayuthapani Veerabhadran
The University of Texas MD Anderson Cancer Center.
Johns Hopkins University.
Ann Appl Stat. 2019;13(3):1957-1988. doi: 10.1214/19-aoas1238. Epub 2019 Oct 17.
Technological innovations have produced large multi-modal datasets that include imaging and multi-platform genomics data. Integrative analyses of such data have the potential to reveal important biological and clinical insights into complex diseases like cancer. In this paper, we present Bayesian approaches for integrative analysis of radiological imaging and multi-platform genomic data. Our goals are to simultaneously identify genomic and radiomic (i.e., radiology-based imaging) markers, along with the latent associations between these two modalities, and to detect the overall prognostic relevance of the combined markers. For this task, we propose a multi-scale Bayesian hierarchical model that involves several innovative strategies: it incorporates integrative analysis of multi-platform genomic datasets to capture fundamental biological relationships; explores the associations of radiomic markers, together with the accompanying genomic information, with clinical outcomes; and detects genomic and radiomic markers associated with clinical prognosis. We also introduce the use of sparse Principal Component Analysis (sPCA) to extract a sparse set of approximately orthogonal meta-features, each containing information from a set of related individual radiomic features, thereby reducing dimensionality and combining like features. Our methods are motivated by and applied to The Cancer Genome Atlas glioblastoma multiforme dataset, wherein we integrate magnetic resonance imaging-based biomarkers with genomic, epigenomic and transcriptomic data. Our model identifies important magnetic resonance imaging features and the associated genomic platforms that are related to patient survival times.
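To make the sPCA step concrete, the following is a minimal sketch of how sparse, approximately orthogonal meta-features could be extracted from a matrix of radiomic features. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which sPCA algorithm is used, so the sketch substitutes a simple soft-thresholded power iteration with deflation. The function names `soft_threshold` and `sparse_pca`, the relative threshold `lam`, and the simulated data are all hypothetical.

```python
import numpy as np


def soft_threshold(z, thr):
    """Shrink each entry of z toward zero by thr; entries below thr become exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)


def sparse_pca(X, n_components=2, lam=0.3, n_iter=100):
    """Sparse PCA sketch: rank-one soft-thresholded power iterations with deflation.

    X            : (n_samples, n_features) feature matrix.
    lam          : assumed tuning parameter in [0, 1); the threshold is
                   lam * max|z|, a scale-free simplification chosen for this sketch.
    Returns (loadings, scores): loadings has unit-norm sparse columns,
    scores are the meta-features (centered X times loadings).
    """
    Xc = X - X.mean(axis=0)          # center each feature
    residual = Xc.copy()
    loadings = np.zeros((X.shape[1], n_components))

    for k in range(n_components):
        # Start from the ordinary (dense) leading principal direction.
        _, _, vt = np.linalg.svd(residual, full_matrices=False)
        v = vt[0]
        for _ in range(n_iter):
            u = residual @ v
            u_norm = np.linalg.norm(u)
            if u_norm == 0.0:
                break
            u = u / u_norm
            z = residual.T @ u
            v_new = soft_threshold(z, lam * np.max(np.abs(z)))
            v_norm = np.linalg.norm(v_new)
            if v_norm == 0.0:
                break
            v = v_new / v_norm
        loadings[:, k] = v
        # Deflate: remove the part of the data explained by this component.
        residual = residual - np.outer(residual @ v, v)

    scores = Xc @ loadings
    return loadings, scores


# Simulated example (hypothetical data): two groups of three related features.
rng = np.random.default_rng(0)
n = 200
f1 = rng.normal(0.0, 3.0, n)         # latent signal behind features 0-2
f2 = rng.normal(0.0, 2.0, n)         # latent signal behind features 3-5
X_demo = np.column_stack([f1, f1, f1, f2, f2, f2]) + rng.normal(0.0, 0.5, (n, 6))

demo_loadings, demo_scores = sparse_pca(X_demo, n_components=2, lam=0.3)
print(np.round(demo_loadings, 3))    # each column should load on one feature group
```

In this toy setting each meta-feature draws on a single group of related features, which is the "combining like features" behavior described above. A production analysis would more likely rely on an established sPCA implementation and choose the sparsity level by cross-validation or a similar criterion.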