College of Life Sciences, Nankai University, Tianjin, China.
Protein Cell. 2012 Aug;3(8):602-8. doi: 10.1007/s13238-012-2914-8. Epub 2012 Jul 21.
The giant panda is one of the most critically endangered species, owing to the fragmentation and loss of its habitat. Studying the functions of its proteins, especially those related to panda-specific traits, is therefore important for protecting the species. In this work, we investigated the functions of these proteins using the genome sequence of the giant panda. Data on 21,001 proteins and their functions were stored in the Giant Panda Protein Database, in which the proteins were divided into two groups: the known-function group, comprising 20,179 proteins whose functions could be predicted by GeneScan, and the unknown-function group, comprising 822 proteins whose functions could not. For the known-function group, we further classified the proteins by molecular function, biological process, cellular component, and tissue specificity. For the unknown-function group, on the assumption that it contains proteins related to panda-specific traits, we developed a strategy in which the proteins were filtered by cross-Blast to identify panda-specific proteins. This filtering identified 32 proteins (2 of which are membrane proteins) that are specific to the giant panda genome when compared against the dog and horse genomes. Based on their amino acid sequences, these 32 proteins were further analyzed by functional classification using SVM-Prot, motif prediction using MyHits, and interacting protein prediction using the Database of Interacting Proteins. Nineteen of them were predicted to be zinc-binding proteins and are thus expected to affect the activities of nucleic acids. The 32 panda-specific proteins will be investigated further by structural and functional analysis.
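The cross-Blast filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the cutoff, the input format, or the tooling, so everything here is an assumption made for illustration: the E-value threshold of 1e-5, the function names `has_homolog` and `panda_specific`, the per-species dictionaries of best E-values, and the toy protein identifiers are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical cutoff; the abstract does not state which threshold was used.
EVALUE_CUTOFF = 1e-5


def has_homolog(best_evalue: Optional[float], cutoff: float = EVALUE_CUTOFF) -> bool:
    """Return True if the best hit against a reference proteome is significant."""
    return best_evalue is not None and best_evalue <= cutoff


def panda_specific(
    unknown_ids: Iterable[str],
    best_hits: Dict[str, Dict[str, float]],
    cutoff: float = EVALUE_CUTOFF,
) -> List[str]:
    """Keep proteins with no significant hit in any reference species.

    best_hits maps species name -> {protein id -> best E-value}. A protein
    missing from a species' map is treated as having no hit in that species.
    """
    kept = []
    for pid in unknown_ids:
        if not any(has_homolog(hits.get(pid), cutoff) for hits in best_hits.values()):
            kept.append(pid)
    return kept


# Toy data, invented for illustration only.
unknown = ["P1", "P2", "P3", "P4"]
hits = {
    "dog": {"P1": 1e-40, "P3": 0.5},
    "horse": {"P2": 1e-12, "P3": 2.0},
}
result = panda_specific(unknown, hits)
print(result)
```

In this toy run, P1 and P2 are discarded because each has a significant hit in one reference species, while P3 (only weak hits) and P4 (no hits) are retained as candidate panda-specific proteins. Treating a protein as species-specific only when it lacks a significant hit in every reference species is the conservative reading of "compared against the dog and horse genomes"; the actual criteria used in the study may differ.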