Yadalam Pradeep Kumar, Ramadoss Ramya, Anegundi Raghavendra Vamsi
Periodontics, Saveetha Dental College, Saveetha Institute of Medical and Technical Sciences (SIMATS) Deemed University, Chennai, IND.
Oral Pathology and Oral Biology, Saveetha Dental College, Saveetha Institute of Medical and Technical Sciences (SIMATS) Deemed University, Chennai, IND.
Cureus. 2024 Sep 7;16(9):e68849. doi: 10.7759/cureus.68849. eCollection 2024 Sep.
Introduction
Beta (β)-catenin, a pivotal protein in bone development and homeostasis, is implicated in various bone disorders. Peptide-based therapeutics offer a promising approach because of their specificity and potential for reduced side effects. Attention networks, particularly sequence-to-sequence models, are widely used for peptide sequence prediction. This study therefore aims to develop a HyperAttention- and informatics-based approach to β-catenin sequence prediction for bone formation.
Methods
β-catenin protein sequences were downloaded in FASTA format and quality-checked using UniProt and DeepBio (Deep Bio Inc., Seoul, South Korea) before predictive analysis. The data were checked for duplicates, outliers, and missing values, then split into training and testing sets, with 80% used for training and 20% for testing. The peptide sequences were encoded and passed to the algorithms.
Results
The HyperAttention and Linformer models both performed well in sequence prediction, with HyperAttention correctly predicting 87% of instances and Linformer 89%. Both models showed high sensitivity and specificity; Linformer correctly identified 91% of negative instances and had slightly better sensitivity.
Conclusion
The HyperAttention and Linformer models predict peptide sequences effectively, with high specificity and sensitivity. Further optimization and development are needed for optimal application and to balance performance between positive and negative instances.
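The preprocessing and evaluation steps named in the abstract (duplicate and missing-value checks, an 80/20 train/test split, sequence encoding, and accuracy, sensitivity, and specificity) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract does not state the encoding scheme, the labeling, or the tooling, so one-hot encoding, binary 0/1 labels, and every function name below are assumptions chosen only for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def clean_sequences(seqs):
    """Drop missing entries, duplicates, and sequences with non-standard residues."""
    seen, kept = set(), []
    for s in seqs:
        s = (s or "").strip().upper()
        if not s or s in seen or any(ch not in AMINO_ACIDS for ch in s):
            continue
        seen.add(s)
        kept.append(s)
    return kept


def one_hot(seq):
    """Encode a peptide as one 20-dimensional one-hot vector per residue.

    One-hot encoding is an assumption; the abstract only says "encoded".
    """
    return [[1 if aa == ch else 0 for aa in AMINO_ACIDS] for ch in seq]


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle and split; the default gives the 80/20 split from the abstract."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = round(len(items) * test_fraction)
    return items[n_test:], items[:n_test]


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true-positive rate), specificity (true-negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


if __name__ == "__main__":
    # Toy inputs, not data from the study.
    seqs = clean_sequences(["MATQ", "matq", "", None, "MAXQ", "ACDE"])
    train, test = train_test_split(seqs)
    print(seqs, len(train), len(test))
    print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Under these definitions, "correctly predicting 87% of instances" corresponds to accuracy, and "identification of 91% of negative instances" corresponds to specificity.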