Exploiting heterogeneous features to improve in silico prediction of peptide status - amyloidogenic or non-amyloidogenic.

Affiliation

Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal University, Karnataka, India.

Publication information

BMC Bioinformatics. 2011;12 Suppl 13(Suppl 13):S21. doi: 10.1186/1471-2105-12-S13-S21. Epub 2011 Nov 30.

Abstract

BACKGROUND

Predicting short stretches of protein sequence that can form amyloid-like fibrils is important for understanding the underlying causes of amyloid diseases, and thereby aids the discovery of sequence-targeted anti-aggregation drugs. Because experimental molecular techniques are limited in their ability to identify such motif segments, computational methods that provide better and more affordable in silico predictions are highly desirable.

RESULTS

Accurate in silico prediction of amyloidogenic peptide regions relies on the interplay between informative features and classifier design. In this article, we propose an efficient fibril prediction implementation that exploits heterogeneous features: bio-physio-chemical (BPC) properties, the auto-correlation function of carefully selected amino acid indices, and the atomic composition of a protein fragment within a window of amino acids. To obtain an optimal number of BPC features, we use an evolutionary Support Vector Machine (SVM) that integrates an SVM with a novel implementation of a hybrid Genetic Algorithm, termed a Memetic Algorithm. Five prediction modules built from Artificial Neural Network (ANN) models are trained on independent and integrated features to validate the fibril-forming motifs. The results provide evidence that incorporating the auto-correlation function as a new feature alongside BPC strengthens the sequence interaction effect in the feature vector, yielding better prediction quality in terms of sensitivity, specificity, Matthews Correlation Coefficient and Area under the Receiver Operating Characteristic curve.
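
As an illustration of the auto-correlation feature described above, the following is a minimal sketch and not the authors' implementation. The abstract does not name the amino acid indices, the window length, or the exact form of the auto-correlation function, so the sketch assumes the Kyte-Doolittle hydropathy scale as a stand-in index, a hypothetical window of six residues, and a simple averaged lagged product. The function names are illustrative only.

```python
# Kyte-Doolittle hydropathy scale, used here only as a stand-in
# amino acid index (assumption: the abstract does not list its indices).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_windows(sequence, size):
    """Return every contiguous fragment of `size` residues."""
    return [sequence[i:i + size] for i in range(len(sequence) - size + 1)]


def autocorrelation(window, index, max_lag):
    """Averaged lagged products of index values for lags 1..max_lag.

    For lag d and window length n the feature is
    sum(v[i] * v[i + d] for i < n - d) / (n - d),
    so residues d positions apart contribute jointly and the
    sequence order is reflected in the feature vector.
    """
    values = [index[aa] for aa in window]
    n = len(values)
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must be between 1 and len(window) - 1")
    return [
        sum(values[i] * values[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(1, max_lag + 1)
    ]


# Hypothetical usage: a 7-residue example sequence, window of 6, lags 1..3.
for fragment in sliding_windows("KLVFFAE", 6):
    features = autocorrelation(fragment, KYTE_DOOLITTLE, 3)
    print(fragment, [round(f, 3) for f in features])
```

Each window yields one feature vector per index; in a heterogeneous scheme like the one described, such vectors would be concatenated with the BPC and atomic-composition features before being passed to a classifier.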

CONCLUSION

A significant improvement in performance is observed when features such as the auto-correlation function, which preserves the sequence-order effect, are introduced alongside the conventional BPC properties selected through a novel optimization strategy to predict peptide status (amyloidogenic or non-amyloidogenic). The proposed approach achieves acceptable results, comparable to most online predictors. It also addresses a gap in existing amyloid fibril prediction tools by maintaining a balance between sensitivity and specificity.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b8f/3278838/4d4aac5c30ad/1471-2105-12-S13-S21-1.jpg
