Prediction, conservation analysis, and structural characterization of mammalian mucin-type O-glycosylation sites.

Authors

Julenius Karin, Mølgaard Anne, Gupta Ramneek, Brunak Søren

Affiliation

Center for Biological Sequence Analysis, BioCentrum, Building 208, Technical University of Denmark, DK-2800 Lyngby, Denmark.

Publication information

Glycobiology. 2005 Feb;15(2):153-64. doi: 10.1093/glycob/cwh151. Epub 2004 Sep 22.

Abstract

O-GalNAc-glycosylation is one of the main types of glycosylation in mammalian cells. No consensus recognition sequence for the O-glycosyltransferases is known, so prediction methods are needed to bridge the gap between the large number of known protein sequences and the small number of proteins whose glycosylation status has been investigated experimentally. A total of 86 mammalian proteins experimentally investigated for in vivo O-GalNAc sites were extracted from O-GLYCBASE. Comparisons of mammalian protein homologs showed that a glycosylated serine or threonine is less likely to be precisely conserved than a nonglycosylated one. The Protein Data Bank was analyzed for structural information, and 12 glycosylated structures were obtained. All positive sites were found in coil or turn regions. A method for predicting the location of mucin-type glycosylation sites was trained using a neural network approach. The best overall network took as input amino acid composition and averaged surface accessibility predictions, together with a substitution matrix profile encoding of the sequence. To improve prediction of isolated (single) sites, networks were trained on isolated sites only. The final method combines predictions from the best overall network and the best isolated-site network; it correctly predicted 76% of the glycosylated residues and 93% of the nonglycosylated residues. NetOGlyc 3.1 can predict sites for completely new proteins without loss of performance. That the sites could be predicted from averaged properties, and that glycosylation sites are not precisely conserved, indicates that mucin-type glycosylation is in most cases a bulk property rather than a very site-specific one. NetOGlyc 3.1 is available at www.cbs.dtu.dk/services/netoglyc.
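
As a rough illustration of two ideas in the abstract, the Python sketch below computes a windowed amino acid composition around each serine or threonine (one kind of averaged input feature) and the two reported performance measures (the fraction of glycosylated and of nonglycosylated residues predicted correctly). It is not the NetOGlyc 3.1 implementation: the function names, the window half-width of 10 residues, and the example sequence are all assumptions made for illustration, and the abstract does not state the window size actually used.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def window_composition(sequence, center, half_width=10):
    """Fraction of each standard amino acid in a window around `center`.

    `center` is a 0-based index; the window is clipped at the sequence ends.
    `half_width=10` is an illustrative assumption, not a value from the paper.
    """
    start = max(0, center - half_width)
    end = min(len(sequence), center + half_width + 1)
    window = sequence[start:end]
    counts = Counter(window)
    # Non-standard residue codes are ignored, so fractions may sum to < 1.
    return [counts.get(aa, 0) / len(window) for aa in AMINO_ACIDS]


def candidate_site_features(sequence, half_width=10):
    """Map each Ser/Thr position to its windowed composition vector."""
    return {
        i: window_composition(sequence, i, half_width)
        for i, residue in enumerate(sequence)
        if residue in "ST"
    }


def sensitivity_specificity(predicted, actual):
    """Return (fraction of true sites found, fraction of non-sites rejected).

    Both arguments are equal-length sequences of booleans, one per Ser/Thr.
    Assumes at least one positive and one negative in `actual`.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical mucin-like fragment, used only to exercise the functions.
    features = candidate_site_features("APTSAPG", half_width=2)
    for position, vector in sorted(features.items()):
        print(position, [round(x, 2) for x in vector])
```

In a predictor of this kind, vectors like these would be combined with the other inputs named in the abstract (surface accessibility predictions and a substitution matrix profile) before being passed to a classifier; those steps are omitted here.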
