Upadhyay Atul Kumar, Sowdhamini Ramanathan
National Centre for Biological Sciences (TIFR), GKVK Campus, Bangalore, India.
Division of Bioinformatics, School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, India.
Bioinform Biol Insights. 2019 Jan 9;13:1177932218821362. doi: 10.1177/1177932218821362. eCollection 2019.
Computational approaches to high-throughput data are gaining importance because of the explosion of sequence data in the post-genomic era. This explosion creates a wide gap between sequence, structure, and function, since the experimental techniques used to determine structure and function are expensive, time-consuming, and laborious. There is therefore an urgent need to develop computational approaches for biological systems. Engagement of proteins in quaternary arrangements, such as domain swapping, might be relevant for higher compatibility of such genes under stress conditions. In this study, the capacity to engage in domain swapping was predicted from sequence information alone across the whole genome of holy basil (Ocimum tenuiflorum), which is well known as an anti-stress agent. Approximately one-fourth of the proteins of O. tenuiflorum are predicted to undergo three-dimensional (3D)-domain swapping. Functional annotation was then carried out on all predicted domain-swap sequences from O. tenuiflorum and from a second genome used for comparison, to assess their distribution across Pfam protein families and gene ontology (GO) terms. These domain-swapped protein sequences are associated with many Pfam protein families and with a wide range of GO annotation terms. A comparative analysis of domain-swap-predicted sequences in O. tenuiflorum with gene products in the second genome reveals that around 26% (2522 sequences) are close homologues across the 2 genomes. Functional annotation suggests that the predicted domain-swap sequences are involved in diverse molecular functions, such as gene regulation under abiotic stress conditions and adaptation to different environmental niches. Finally, the positively predicted sequences from both genomes were examined for their presence in the stress regulome, as recorded in our STIFDB database, to check the involvement of these proteins in different abiotic stresses.