Sequence-Based Prediction of Cysteine Reactivity Using Machine Learning.

Authors

Wang Haobo, Chen Xuemin, Li Can, Liu Yuan, Yang Fan, Wang Chu

Affiliations

Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing 100871, China.

Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China.

Publication information

Biochemistry. 2018 Jan 30;57(4):451-460. doi: 10.1021/acs.biochem.7b00897. Epub 2017 Oct 26.

Abstract

As one of the most intrinsically reactive amino acids, cysteine carries a variety of important biochemical functions, including catalysis and redox regulation. Discovery and characterization of cysteines with heightened reactivity will help annotate protein functions. Chemical proteomic methods have been used to quantitatively profile cysteine reactivity in native proteomes, showing a strong correlation between the chemical reactivity of a cysteine and its functionality; however, the relationship between the cysteine reactivity and its local sequence has not yet been systematically explored. Herein, we report a machine learning method, sbPCR (sequence-based prediction of cysteine reactivity), which combines the basic local alignment search tool, truncated composition of k-spaced amino acid pair analysis, and support vector machine to predict cysteines with hyper-reactivity based on only local sequence features. Using a benchmark set compiled from hyper-reactive cysteines in human proteomes, our method can achieve a prediction accuracy of 98%, a precision of 95%, and a recall ratio of 89%. We utilized these governing features of local sequence motifs to expand the prediction to potential hyper-reactive cysteines in other proteomes deposited in the UniProt database. We validated our predictions in Escherichia coli by activity-based protein profiling and discovered a hyper-reactive cysteine from a functionally uncharacterized protein, YecH. Biochemical analysis suggests that the hyper-reactive cysteine might be involved in metal binding. Our computational method provides a large inventory of potential hyper-reactive cysteines in proteomes and is highly complementary to other experimental approaches to guide systematic annotation of protein functions in the postgenome era.
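The abstract names composition of k-spaced amino acid pairs (CKSAAP) as the local sequence encoding but gives no parameters. As a rough illustration only, the sketch below computes plain CKSAAP features for a window centered on each cysteine. The flank length, the maximum spacing `k_max`, the `X` padding character, and the function names are all assumptions made for this example; the "truncated" variant and the BLAST step used in sbPCR are not reproduced here.

```python
from itertools import product

# The 20 standard amino acids and all 400 ordered residue pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cysteine_windows(sequence, flank=10, pad="X"):
    """Yield (1-based position, window) for every cysteine in the sequence.

    The window holds `flank` residues on each side of the cysteine; the
    sequence ends are padded with `pad` (an assumption of this sketch).
    """
    padded = pad * flank + sequence + pad * flank
    for i, residue in enumerate(sequence):
        if residue == "C":
            yield i + 1, padded[i : i + 2 * flank + 1]


def cksaap(window, k_max=3):
    """Return plain CKSAAP features: 400 pair frequencies per spacing k.

    For each k in 0..k_max, count the pairs (window[i], window[i + k + 1])
    and divide by the number of pairs made of standard residues only.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = 0
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # skips pairs containing the pad character
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in PAIRS)
    return features


if __name__ == "__main__":
    # Toy sequence, not taken from the paper.
    for position, window in cysteine_windows("MKTCAYLLGCDE", flank=4):
        vector = cksaap(window, k_max=2)
        print(position, window, len(vector))
```

Feature vectors of this kind could then be passed to a support vector machine classifier, which is the role the abstract assigns to the SVM step.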
