AllerCatPro-prediction of protein allergenicity potential from the protein sequence.

Author information

Maurer-Stroh Sebastian, Krutz Nora L, Kern Petra S, Gunalan Vithiagaran, Nguyen Minh N, Limviphuvadh Vachiranee, Eisenhaber Frank, Gerberick G Frank

Affiliations

Biomolecular Function Discovery Division, Bioinformatics Institute, Agency for Science, Technology and Research, Singapore.

Department of Biological Sciences, National University of Singapore, Singapore.

Publication information

Bioinformatics. 2019 Sep 1;35(17):3020-3027. doi: 10.1093/bioinformatics/btz029.

Abstract

MOTIVATION

Due to the risk of inducing an immediate Type I (IgE-mediated) allergic response, proteins intended for use in consumer products must be investigated for their allergenic potential before introduction into the marketplace. The FAO/WHO guidelines for computational assessment of allergenic potential of proteins based on short peptide hits and linear sequence window identity thresholds misclassify many proteins as allergens.
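To illustrate why a rule based on short peptide hits tends to over-predict, the following is a minimal sketch of a naive exact-match peptide check. It assumes the six-residue peptide length usually associated with this rule; the function names and the toy sequences are hypothetical and are not taken from the guidelines or from AllerCatPro.

```python
def kmers(seq, k=6):
    """Return the set of all contiguous k-residue peptides in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def naive_peptide_hit(query, allergens, k=6):
    """True if the query shares at least one exact k-mer with any allergen.

    A single shared peptide is enough to flag the query, which is why
    this kind of rule can label many unrelated proteins as allergens.
    """
    query_kmers = kmers(query, k)
    return any(query_kmers & kmers(allergen, k) for allergen in allergens)


# Toy sequences (hypothetical, for illustration only).
toy_allergens = ["MKTLLVAAAAAAGHW"]
flagged = naive_peptide_hit("PQRSAAAAAAYTE", toy_allergens)   # shares "AAAAAA"
not_flagged = naive_peptide_hit("PQRSTVWYE", toy_allergens)   # shares nothing
```

In this toy case the only shared peptide is a run of six alanines, a low-complexity stretch that can easily occur by chance.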

RESULTS

We developed AllerCatPro, which predicts the allergenic potential of proteins based on the similarity of their 3D protein structure, as well as their amino acid sequence, to a data set of known protein allergens. The data set comprises 4180 unique allergenic protein sequences derived from the union of the major databases Food Allergy Research and Resource Program, Comprehensive Protein Allergen Resource, WHO/International Union of Immunological Societies, UniProtKB and Allergome. We extended the hexamer hit rule by removing peptides with a high probability of random occurrence, as measured by sequence entropy, and by requiring 3 or more hexamer hits consistent with natural linear epitope patterns in known allergens. This is complemented by gluten-like repeat pattern detection. We also switched from a linear sequence window similarity to a B-cell epitope-like 3D surface similarity window, which became possible through extensive 3D structure modeling covering the majority (74%) of allergens. If no structure similarity is found, the decision workflow reverts to the old linear sequence window rule. The overall accuracy of AllerCatPro is 84%, compared with 51 to 73% for other current methods. Both the FAO/WHO rules and AllerCatPro achieve the highest sensitivity, but AllerCatPro provides a 37-fold increase in specificity.
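The extended hexamer rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the AllerCatPro implementation: the Shannon entropy measure, the `min_entropy=2.0` cutoff, the requirement that all hits fall on the same allergen, and the toy sequences are hypothetical choices, and the check for consistency with natural linear epitope patterns is not modeled.

```python
import math
from collections import Counter


def shannon_entropy(peptide):
    """Shannon entropy (bits) of the residue composition of a peptide."""
    n = len(peptide)
    counts = Counter(peptide)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def filtered_hexamer_hits(query, allergen, k=6, min_entropy=2.0):
    """List (position, peptide) hexamer hits, skipping low-entropy peptides.

    min_entropy is a hypothetical cutoff chosen for this sketch.
    """
    allergen_kmers = {allergen[i:i + k] for i in range(len(allergen) - k + 1)}
    hits = []
    for i in range(len(query) - k + 1):
        peptide = query[i:i + k]
        if peptide in allergen_kmers and shannon_entropy(peptide) >= min_entropy:
            hits.append((i, peptide))
    return hits


def passes_extended_rule(query, allergens, min_hits=3, k=6, min_entropy=2.0):
    """True if some allergen yields at least min_hits filtered hexamer hits.

    Counting hits per allergen is an assumption of this sketch.
    """
    return any(
        len(filtered_hexamer_hits(query, a, k, min_entropy)) >= min_hits
        for a in allergens
    )


# Toy sequences (hypothetical, for illustration only).
toy_allergen = "MKTACDEFGHIKLWAAAAAAPQ"
low_complexity_query = "RRAAAAAARR"     # shares only the poly-alanine hexamer
multi_hit_query = "GGACDEFGHIKGG"       # shares four diverse hexamers

rejected = passes_extended_rule(low_complexity_query, [toy_allergen])
accepted = passes_extended_rule(multi_hit_query, [toy_allergen])
```

In this sketch the poly-alanine match is discarded because its entropy is zero, while the second query is retained because it shares several compositionally diverse hexamers with the same allergen.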

AVAILABILITY AND IMPLEMENTATION

https://allercatpro.bii.a-star.edu.sg/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
