Rathankar N, Nirmala K A, Khanduja Varun, Nagendra H G
Department of Bioinformatics, School of Bioengineering, SRM University, Kattankulathur, Tamil Nadu 603 203, India.
ISRN Neurol. 2011;2011:265253. doi: 10.5402/2011/265253. Epub 2011 Sep 4.
High-throughput genome sequencing has led to an explosion of data in sequence databanks and to an imbalance in sequence-structure-function relationships, leaving a substantial fraction of proteins classified as hypothetical proteins. Functions can be assigned to such proteins by analyzing and characterizing the domains they are made up of. Domains are the basic evolutionary units of proteins, and most proteins contain multiple domains. One subset of multidomain proteins contains fused (overlapping) domains, in which the sequences of two or more domains overlap. These fused domains result from gene fusion events, and their involvement in disease is well established. This paper therefore attempts to identify hypothetical proteins from the human genome that contain fused domains and are homologous to parkinsonian targets in the KEGG database. The study identified 18 hypothetical proteins whose domains are fused with ubiquitin domains and which are homologous to targets in the parkinsonian pathway.
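The notion of fused (overlapping) domains can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which the abstract does not describe. It assumes domain annotations are available as residue ranges on a single protein, and the `Domain` type, the `overlapping_pairs` function, the `min_overlap` parameter, and the example domain names and coordinates are all hypothetical.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Domain(NamedTuple):
    """A domain annotation on one protein (hypothetical record layout)."""
    name: str
    start: int  # first residue, 1-based, inclusive
    end: int    # last residue, 1-based, inclusive


def overlapping_pairs(
    domains: List[Domain], min_overlap: int = 1
) -> List[Tuple[str, str, int]]:
    """Return (name_a, name_b, shared_residues) for every pair of domains
    whose residue ranges share at least `min_overlap` positions."""
    hits = []
    ordered = sorted(domains, key=lambda d: d.start)
    for a, b in combinations(ordered, 2):
        # Length of the intersection of two inclusive intervals;
        # zero or negative means the ranges do not overlap.
        shared = min(a.end, b.end) - max(a.start, b.start) + 1
        if shared >= min_overlap:
            hits.append((a.name, b.name, shared))
    return hits


# Illustrative coordinates only; not taken from the paper.
example = [
    Domain("ubiquitin_like", 1, 76),
    Domain("domain_X", 60, 180),
    Domain("domain_Y", 200, 310),
]
print(overlapping_pairs(example))
```

Here only `ubiquitin_like` and `domain_X` share residues, so only that pair would be flagged as a candidate fused domain. A real analysis would take domain boundaries from a domain database and would likely set a stricter overlap threshold.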