Predicting sulfotyrosine sites using the random forest algorithm with significantly improved prediction accuracy.

Affiliation

School of Biosciences, University of Exeter, Exeter EX4 5DE, UK.

Publication information

BMC Bioinformatics. 2009 Oct 29;10:361. doi: 10.1186/1471-2105-10-361.

DOI:10.1186/1471-2105-10-361
PMID:19874585
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2777180/
Abstract

BACKGROUND

Tyrosine sulfation is one of the most important posttranslational modifications. Due to its relevance to various disease developments, tyrosine sulfation has become the target for drug design. In order to facilitate efficient drug design, accurate prediction of sulfotyrosine sites is desirable. A predictor published seven years ago has been very successful with claimed prediction accuracy of 98%. However, it has a particularly low sensitivity when predicting sulfotyrosine sites in some newly sequenced proteins.

RESULTS

A new approach has been developed for predicting sulfotyrosine sites using the random forest algorithm after a careful evaluation of seven machine learning algorithms. Peptides are formed by consecutive residues symmetrically flanking tyrosine sites. They are then encoded using an amino acid hydrophobicity scale. This new approach has increased the sensitivity by 22%, the specificity by 3%, and the total prediction accuracy by 10% compared with the previous predictor using the same blind data. Meanwhile, both negative and positive predictive powers have been increased by 9%. In addition, the random forest model has an excellent feature for ranking the residues flanking tyrosine sites, hence providing more information for further investigating the tyrosine sulfation mechanism. A web tool has been implemented at http://ecsb.ex.ac.uk/sulfotyrosine for public use.
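The peptide construction and encoding steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which hydrophobicity scale or window size was used, so the Kyte-Doolittle scale, the `flank` parameter, the `X` padding symbol, and the function names are all assumptions made for the example.

```python
# Kyte-Doolittle hydropathy values. The choice of this particular scale is an
# assumption; the abstract only says "an amino acid hydrophobicity scale".
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

PAD = "X"  # hypothetical padding symbol for windows that run past a sequence end


def tyrosine_windows(sequence, flank=4):
    """Yield (position, peptide) for each tyrosine in `sequence`.

    Each peptide has `flank` residues on either side of the tyrosine, so it is
    symmetric and 2 * flank + 1 residues long. `flank=4` is an assumed default.
    """
    seq = sequence.upper()
    padded = PAD * flank + seq + PAD * flank
    for i, residue in enumerate(seq):
        if residue == "Y":
            # Position i in seq is position i + flank in padded, so the window
            # centred on the tyrosine starts at padded index i.
            yield i, padded[i:i + 2 * flank + 1]


def encode_peptide(peptide, scale=KYTE_DOOLITTLE, unknown=0.0):
    """Map each residue to its hydrophobicity value; unknown symbols get `unknown`."""
    return [scale.get(residue, unknown) for residue in peptide]


if __name__ == "__main__":
    for position, peptide in tyrosine_windows("ADEYDGS", flank=2):
        print(position, peptide, encode_peptide(peptide))
```

Each encoded peptide is a fixed-length numeric vector, which is the form a random forest (or any of the other classifiers the paper compares) expects as input. Because every vector position corresponds to one flanking residue, a per-feature importance ranking from the forest can be read as a ranking of positions around the tyrosine.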

CONCLUSION

The random forest algorithm is able to deliver a better model compared with the Hidden Markov Model, the support vector machine, artificial neural networks, and others for predicting sulfotyrosine sites. The success shows that the random forest algorithm together with an amino acid hydrophobicity scale encoding can be a good candidate for peptide classification.
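The comparison between predictors is reported in terms of sensitivity, specificity, total accuracy, and positive and negative predictive power. As a reminder of how these five measures relate to a confusion matrix, here is a small sketch using their standard definitions. The function name and the counts in the usage example are hypothetical and are not taken from the paper.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification measures from confusion counts.

    tp: sulfated sites predicted as sulfated
    fp: non-sulfated sites predicted as sulfated
    tn: non-sulfated sites predicted as non-sulfated
    fn: sulfated sites predicted as non-sulfated
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),           # true positive rate
        "specificity": _ratio(tn, tn + fp),           # true negative rate
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "ppv": _ratio(tp, tp + fp),                   # positive predictive power
        "npv": _ratio(tn, tn + fn),                   # negative predictive power
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in classification_metrics(tp=8, fp=2, tn=18, fn=2).items():
        print(f"{name}: {value:.3f}")
```

Reporting sensitivity separately from accuracy matters here because sulfated tyrosines are a minority class: a predictor can reach a high total accuracy while still missing many true sites, which is the weakness the background section attributes to the earlier predictor.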
