Hansen Sara Fasmer, Bettler Emmanuel, Wimmerová Michaela, Imberty Anne, Lerouxel Olivier, Breton Christelle
University of Grenoble, F-38041 Grenoble, France.
J Proteome Res. 2009 Feb;8(2):743-53. doi: 10.1021/pr800808m.
Approximately 450 glycosyltransferase (GT) sequences have already been identified in the Arabidopsis genome and are organized into 40 sequence-based families, but the vast majority of these gene products remain biochemically uncharacterized open reading frames. Given the complexity of the cell wall carbohydrate network, it can be inferred that some biosynthetic genes have not yet been identified by classical bioinformatics approaches. To identify new plant GT genes, we designed a bioinformatic strategy based on several remote homology detection methods that operate at the 1D, 2D, and 3D levels. Together, these methods identified more than 150 candidate protein sequences. Of these, 20 are considered putative glycosyltransferases that should be investigated further, because known GT signatures were clearly identified in them.