Biochemistry Department, Pasteur Institute of Iran, Tehran, Iran.
Molecular Biology Department, Pasteur Institute of Iran, Tehran, Iran.
Mol Genet Genomic Med. 2020 May;8(5):e1219. doi: 10.1002/mgg3.1219. Epub 2020 Mar 10.
In the human genome, the network of transcription factors (TFs) and transcription factor-binding sites (TFBSs) plays a major regulatory role in biological pathways. This crosstalk can be affected by single-nucleotide polymorphisms (SNPs), which may create or disrupt a TFBS and lead to either a disease or a phenotypic defect. Many computational resources have been introduced to predict changes in TF binding caused by SNPs within TFBSs, sTRAP being one of them.
A literature review was performed, and experimental data for 18 TFBSs located in 12 genes were compiled. The sequences of the TFBS motifs were extracted using two different strategies: at a size similar to that of the synthetic target sites used in the experimental techniques, and with 60 bp upstream and downstream of the SNPs. sTRAP (http://trap.molgen.mpg.de/cgi-bin/trap_two_seq_form.cgi) was applied to compute the binding affinity scores of the cognate TFs in the context of the reference and mutant TFBS sequences. The alternative bioinformatics model used in this study was regulatory analysis of variation in enhancers (RAVEN; http://www.cisreg.ca/cgi-bin/RAVEN/a). The bioinformatics outputs of our study were compared with experimental data from electrophoretic mobility shift assays (EMSA).
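The second extraction strategy (60 bp on each side of the SNP) can be sketched as follows. This is only an illustrative sketch, not the procedure used in the study: the function name `flanking_alleles`, its parameters, and the 0-based coordinate convention are assumptions introduced here, and the resulting pair of sequences is simply what a two-sequence comparison form would take as input.

```python
def flanking_alleles(sequence, snp_index, ref, alt, flank=60):
    """Return (reference_window, mutant_window) around a SNP.

    Hypothetical helper: snp_index is 0-based, and the window is
    truncated if the SNP lies closer than `flank` bases to either end.
    """
    sequence = sequence.upper()
    ref, alt = ref.upper(), alt.upper()
    if len(ref) != 1 or len(alt) != 1:
        raise ValueError("only single-nucleotide substitutions are handled")
    if sequence[snp_index] != ref:
        raise ValueError("reference allele does not match the sequence")

    # Window boundaries, clipped to the available sequence.
    start = max(0, snp_index - flank)
    end = min(len(sequence), snp_index + flank + 1)

    window_ref = sequence[start:end]
    offset = snp_index - start  # position of the SNP inside the window
    window_mut = window_ref[:offset] + alt + window_ref[offset + 1:]
    return window_ref, window_mut


if __name__ == "__main__":
    # Toy sequence for illustration only, not a real locus.
    toy = "A" * 100 + "C" + "G" * 100
    reference, mutant = flanking_alleles(toy, 100, "C", "T")
    print(len(reference), reference[60], mutant[60])
```

With the default `flank=60` and a SNP far enough from both ends, each window is 121 bp long and the variant base sits at the center.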
In 6 of the 18 TFBSs, located in the genes COL1A1, Hb ḉᴪ, TF, FIX, MBL2, and NOS2A, the sTRAP outputs were inconsistent with the EMSA results. Furthermore, no p value was presented for the difference between the binding affinity scores of the wild-type and mutant TFBSs, nor were any criteria provided for preferring or selecting among the measurements from the different matrices used for the same analysis.
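One way to frame such a comparison is to reduce each prediction to a direction of change and count agreements with the assay. The sketch below assumes a simple sign-based rule with an optional tolerance, which is a choice made here for illustration and not a criterion reported in the study; the function names and the labels "increase", "decrease", and "no change" are likewise hypothetical.

```python
def direction(ref_score, mut_score, tol=0.0):
    """Classify the predicted change in binding from two affinity scores.

    Assumed rule: the sign of (mutant - reference), ignoring
    differences whose magnitude does not exceed `tol`.
    """
    diff = mut_score - ref_score
    if diff > tol:
        return "increase"
    if diff < -tol:
        return "decrease"
    return "no change"


def concordance(predicted, observed):
    """Return (number of agreeing sites, total sites)."""
    if len(predicted) != len(observed):
        raise ValueError("both lists must cover the same sites")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches, len(predicted)


if __name__ == "__main__":
    # Placeholder scores and assay calls, not data from the study.
    predicted = [direction(1.2, 0.4), direction(0.3, 0.9), direction(0.5, 0.5)]
    observed = ["decrease", "decrease", "no change"]
    agree, total = concordance(predicted, observed)
    print(f"{agree}/{total} sites concordant")
```

Under this framing, 6 inconsistent sites out of 18 corresponds to 12 of 18 concordant sites, or about 67%.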
Our preliminary study revealed some inconsistencies between sTRAP predictions and experimental data. Linking sTRAP output to biological function may therefore require optimizing the tool against experimental procedures, integrating expanded data, and applying several other bioinformatics tools.