Department of Biomedical Informatics, Emory University, Atlanta, GA, USA.
Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA.
J Pathol. 2018 Apr;244(5):512-524. doi: 10.1002/path.5028. Epub 2018 Feb 22.
The Cancer Genome Atlas (TCGA) represents one of several international consortia dedicated to performing comprehensive genomic and epigenomic analyses of selected tumour types to advance our understanding of disease and provide an open-access resource for worldwide cancer research. Thirty-three tumour types (selected by histology or tissue of origin, to include both common and rare diseases), comprising >11 000 specimens, were subjected to DNA sequencing, copy number and methylation analysis, and transcriptomic, proteomic and histological evaluation. Each cancer type was analysed individually to identify tissue-specific alterations and to correlate findings across different molecular platforms. The final dataset was then normalized and combined for the PanCancer Initiative, which seeks to identify commonalities across different cancer types or cells of origin/lineage, or within anatomically or morphologically related groups. An important resource generated along with the rich molecular studies is an extensive digital pathology slide archive, composed of frozen section tissue directly related to the tissues analysed as part of TCGA, and representative formalin-fixed paraffin-embedded, haematoxylin and eosin (H&E)-stained diagnostic slides. These H&E image resources have primarily been used to verify diagnoses and histological subtypes, with some limited extraction of standard pathological variables such as mitotic activity, grade, and lymphocytic infiltrates. Largely overlooked is the richness of these scanned images for more sophisticated feature extraction approaches coupled with machine learning, and ultimately for correlation with molecular features and clinical endpoints. Here, we document initial attempts to exploit TCGA imaging archives, and describe some of the tools and the rapidly evolving image analysis/feature extraction landscape.
Our hope is to inform, and ultimately inspire and challenge, the pathology and cancer research communities to exploit these imaging resources so that the full potential of this integral platform of TCGA can be used to complement and enhance the insightful integrated analyses from the genomic and epigenomic platforms. Copyright © 2017 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.