Institute of Intelligent System and Bioinformatics, College of Automation, Harbin Engineering University, Harbin, Heilongjiang, China.
Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA.
BMC Med Genomics. 2020 Dec 28;13(Suppl 11):195. doi: 10.1186/s12920-020-00828-4.
Existing studies have demonstrated that the integrative analysis of histopathological images and genomic data can be used to better understand the onset and progression of many diseases, as well as to identify new diagnostic and prognostic biomarkers. However, since the development of pathological phenotypes is influenced by a variety of complex biological processes, a complete understanding of the gene regulatory mechanisms underlying cell and tissue morphology is still a challenge. In this study, we explored the relationship between chromatin accessibility changes and the epithelial tissue proportion in histopathological images of estrogen receptor (ER) positive breast cancer.
An established whole slide image processing pipeline based on deep learning was used to perform global segmentation of epithelial and stromal tissues. We then used canonical correlation analysis to detect regulatory regions associated with the epithelial tissue proportion. By integrating ATAC-seq data with matched RNA-seq data, we identified the potential target genes that are associated with these regulatory regions. We then used these genes in the subsequent pathway and survival analyses.
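The correlation step can be illustrated with a minimal sketch. This is not the authors' code: the function name `canonical_correlations` and the synthetic data are hypothetical, and the sketch assumes NumPy is available and that both data blocks have full column rank. It computes canonical correlations as the singular values of the product of orthonormal bases of the two centered blocks; for one region and one image feature, the result reduces to the absolute Pearson correlation.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between two data blocks (rows = samples).

    Hypothetical helper: assumes both blocks have full column rank and
    more samples than columns.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    if Y.ndim == 1:
        Y = Y[:, None]
    # Center each column so that correlations, not raw inner products, are measured.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases of the two column spaces.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic stand-ins (NOT study data): accessibility of one region and
# epithelial tissue proportion across 54 samples.
rng = np.random.default_rng(0)
accessibility = rng.normal(size=54)
epithelial_proportion = 0.6 * accessibility + rng.normal(scale=0.5, size=54)
rho = canonical_correlations(accessibility, epithelial_proportion)[0]
```

In a per-region screen, a statistic like `rho` would be computed for every candidate region and converted to a p-value before multiple-testing correction; how the original analysis derived its p-values is not stated in the abstract.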
Using canonical correlation analysis, we detected 436 potential regulatory regions that exhibited significant correlation between quantitative chromatin accessibility changes and the epithelial tissue proportion in tumors from 54 patients (FDR < 0.05). We then found that these 436 regulatory regions were associated with 74 potential target genes. Functional enrichment analysis showed that these potential target genes were enriched in cancer-associated pathways. We further demonstrated that combining the gene expression signals and the epithelial tissue proportion extracted from this integration framework stratified patient prognoses more accurately than predictions based on omics or image features alone.
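The FDR threshold above implies a multiple-testing correction over all tested regions. The abstract does not say which procedure was used; the sketch below assumes the Benjamini-Hochberg method, and the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

# Hypothetical per-region p-values (NOT study data).
region_pvalues = [0.01, 0.04, 0.03, 0.005]
region_qvalues = benjamini_hochberg(region_pvalues)
# Regions passing an FDR < 0.05 threshold.
significant = [i for i, q in enumerate(region_qvalues) if q < 0.05]
```

Regions whose q-value falls below 0.05 would be retained as candidate regulatory regions under this assumed procedure.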
This integrative analysis is a useful strategy for identifying potential regulatory regions in the human genome that are associated with tumor tissue quantification. It will enable efficient prioritization of genomic regulatory regions identified from ATAC-seq data for further studies that validate their causal regulatory function. Ultimately, identifying regulatory regions associated with the epithelial tissue proportion will further our understanding of the underlying molecular mechanisms of disease and inform the development of potential therapeutic targets.