Narad Priyanka, Kumar Abhishek, Chakraborty Amlan, Patni Pranav, Sengupta Abhishek, Wadhwa Gulshan, Upadhyaya K C
Amity Institute of Biotechnology, Amity University Uttar Pradesh, Sector-125, Noida, 201303, India.
Department of Biotechnology, Ministry of Science & Technology, New Delhi, India.
Interdiscip Sci. 2017 Sep;9(3):378-391. doi: 10.1007/s12539-016-0168-5. Epub 2016 Apr 6.
Transcription factors (TFs) are trans-acting proteins that interact with specific nucleotide sequences known as transcription factor binding sites (TFBSs), and these interactions are implicated in the regulation of gene expression. Transcriptional activation of a gene often involves multiple interactions of transcription factors with various sequence elements. Identifying these sequence elements is the first step in understanding the molecular mechanism(s) that regulate gene expression. For in silico identification of these sequence elements, we have developed an online computational tool, the Transcription Factor Information System (TFIS), which for the first time uses a collection of Java programs to detect TFBSs and is based mainly on position weight matrix (PWM) scoring. The position frequency matrices (PFMs) are obtained from JASPAR and HOCOMOCO, which are open-access databases of transcription factor binding profiles. Pseudo-counts are used when converting a PFM to a PWM, and TFBS detection is carried out using a percent score as the threshold value. In addition to general detection methods, TFIS offers advanced features: direct sequence retrieval from the NCBI database by gene identification number or accession number, detection of binding sites for TFs common to a batch of gene sequences, and TFBS detection with a PWM generated from known raw binding sequences. TFIS detects potential TFBSs in both orientations at the same time, which increases its efficiency, and the results of this dual detection are presented in colors specific to the orientation of the binding site.
Results obtained by TFIS are more detailed and more specific to the detected TFs because the result pages link to related web servers such as Gene Ontology, the PAZAR database and the Transcription Factor Encyclopedia, in addition to NCBI and UniProt. Common TFs of the amyloid beta precursor gene, such as SP1, AP1 and NF-κB, are easily detected using TFIS, along with multiple binding sites. In a second scenario, an embryonic developmental process, TFs of the FOX family (FOXL1 and FOXC1) were also identified. TFIS is platform-independent and publicly available, along with its support and documentation, at http://tfistool.appspot.com and http://www.bioinfoplus.com/tfis/ . TFIS is licensed under the GNU General Public License, version 3 (GPL-3.0).
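The PWM-based detection summarized in the abstract (PFM to PWM conversion with pseudo-counts, a percent-score threshold, and scanning in both orientations) can be illustrated with a minimal Python sketch. This is not the TFIS source code, which is written in Java. All function names, the pseudo-count of 0.8, the uniform background of 0.25, and the definition of the percent score as (score − min) / (max − min) × 100 are assumptions made for illustration; the abstract does not specify them.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def pfm_from_sites(sites):
    """Build a position frequency matrix from equal-length raw binding sequences."""
    length = len(sites[0])
    pfm = {b: [0] * length for b in BASES}
    for site in sites:
        for i, base in enumerate(site.upper()):
            pfm[base][i] += 1
    return pfm


def pfm_to_pwm(pfm, pseudocount=0.8, background=0.25):
    """Convert counts to log-odds scores (assumed pseudo-count, split over 4 bases)."""
    length = len(pfm["A"])
    pwm = {b: [0.0] * length for b in BASES}
    for i in range(length):
        total = sum(pfm[b][i] for b in BASES) + pseudocount
        for b in BASES:
            freq = (pfm[b][i] + pseudocount / 4) / total
            pwm[b][i] = math.log2(freq / background)
    return pwm


def percent_score(pwm, window):
    """Score of a window rescaled to 0-100 between the worst and best possible site."""
    length = len(pwm["A"])
    raw = sum(pwm[b][i] for i, b in enumerate(window))
    lo = sum(min(pwm[b][i] for b in BASES) for i in range(length))
    hi = sum(max(pwm[b][i] for b in BASES) for i in range(length))
    if hi == lo:  # uninformative matrix: every window scores the same
        return 100.0
    return 100.0 * (raw - lo) / (hi - lo)


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def scan(pwm, sequence, threshold=80.0):
    """Report (start, strand, site, percent score) for hits on both strands."""
    sequence = sequence.upper()
    length = len(pwm["A"])
    hits = []
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        if any(b not in BASES for b in window):
            continue  # skip windows containing ambiguous bases such as N
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            pct = percent_score(pwm, site)
            if pct >= threshold:
                hits.append((start, strand, site, round(pct, 1)))
    return hits


if __name__ == "__main__":
    # Hypothetical AP-1-like binding sites, used only to exercise the functions.
    example_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTCA"]
    example_pwm = pfm_to_pwm(pfm_from_sites(example_sites))
    for hit in scan(example_pwm, "AATGACTCATT", threshold=90.0):
        print(hit)
```

Scoring the reverse complement of each window in the same pass is one simple way to cover both orientations in a single scan; the strand label returned with each hit is what a front end could map to orientation-specific colors.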