Department of Biological Sciences, National University of Singapore, Singapore, Singapore.
Cancer and Stem Cell Biology, and Centre for Computational Biology, Duke-NUS Medical School, Singapore, Singapore.
BMC Genomics. 2022 Jul 20;23(Suppl 1):525. doi: 10.1186/s12864-022-08565-x.
The transforming growth factor beta-1 (TGF β-1) cytokine exerts both pro-tumor and anti-tumor effects in carcinogenesis. An increasing body of literature suggests that TGF β-1 signaling outcome is partially dependent on the regulatory targets of downstream receptor-regulated Smad (R-Smad) proteins Smad2 and Smad3. However, the lack of Smad-specific antibodies for ChIP-seq hinders convenient identification of Smad-specific binding sites.
In this study, we use localization and affinity purification (LAP) tags to identify Smad-specific binding sites in a cancer cell line. Using ChIP-seq data obtained from LAP-tagged Smad proteins, we develop a convolutional neural network with long short-term memory (CNN-LSTM), a deep learning approach, to classify a pool of Smad-bound sites as Smad2- or Smad3-bound. Our data show that this approach accurately classifies Smad2- versus Smad3-bound sites. We then use the model to dissect the role of each R-Smad in the progression of breast cancer using a previously published dataset.
Our results suggest that deep learning approaches can be used to dissect the binding site specificity of closely related transcription factors.
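A sequence classifier of the kind described above needs each binding-site sequence in numeric form before a CNN-LSTM can consume it. The following is a minimal sketch of one common way to do that, one-hot encoding, under stated assumptions: the abstract does not say how sequences were encoded, and the names `one_hot_encode`, `build_example`, and `LABELS`, as well as the 0/1 label assignment, are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: the encoding scheme and all names here are
# assumptions, not details taken from the study.

BASES = "ACGT"  # column order of the one-hot vector

# Hypothetical label assignment for the binary Smad2-vs-Smad3 task.
LABELS = {"Smad2": 0, "Smad3": 1}


def one_hot_encode(sequence):
    """Return one 4-element row per base, in A, C, G, T order."""
    rows = []
    for base in sequence.upper():
        row = [0.0, 0.0, 0.0, 0.0]
        index = BASES.find(base)
        if index >= 0:  # ambiguous bases such as N stay all-zero
            row[index] = 1.0
        rows.append(row)
    return rows


def build_example(sequence, bound_by):
    """Pair an encoded binding-site sequence with its class label."""
    return one_hot_encode(sequence), LABELS[bound_by]


features, label = build_example("ACGTN", "Smad3")
```

Each encoded site is a length-by-4 matrix, which is the shape a one-dimensional convolutional layer would typically scan for motif-like patterns before a recurrent layer models their order.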