School of Computer, Jiangsu University of Science and Technology, 666 Changhui Road, Zhenjiang 212100, China.
Duke Kunshan University, Duke Avenue, Kunshan, Jiangsu 215316, China.
J Chem Inf Model. 2024 Aug 12;64(15):6216-6229. doi: 10.1021/acs.jcim.4c00739. Epub 2024 Aug 2.
Accurately predicting mutations in protein metal-binding sites is important for advancing drug discovery and improving disease diagnosis. In response, we present MetalTrans, an accurate predictor of disease-associated mutations in protein metal-binding sites. The core innovation of MetalTrans is its integration of multifeature splicing with the Transformer framework, a strategy intended to extract features thoroughly. Central to its effectiveness is a deep feature combination strategy that merges Evolutionary Scale Modeling (ESM) amino acid embeddings with ProtTrans embeddings, which capture the biochemical properties of proteins. The Transformer component then applies the self-attention mechanism to learn higher-level representations. Using mutation site information for feature fusion both enriches the feature set and avoids the overestimation commonly associated with protein sequence-based predictions. Comparative analyses show that this feature fusion approach enables MetalTrans to outperform existing methods significantly. In our evaluations on four metal-binding site data sets (Zn, Ca, Mg, and Mix), MetalTrans achieved average AUC values of 0.971, 0.965, 0.980, and 0.945, respectively, on multiple 5-fold cross-validation. Compared with the multichannel convolutional neural network method on a benchmark independent test set, MetalTrans achieved an AUC score of 0.998 on multiple 5-fold cross-validation. Our further examination of the predicted outcomes confirms the effectiveness of the model. The source code, data sets, and prediction results for MetalTrans are available for academic use at https://github.com/EduardWang/MetalTrans.
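The fusion-then-attention idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that). It assumes that "splicing" means concatenating the two per-residue embeddings, that features are taken from a small window around the mutation site, and that attention is single-head with identity projections. The function names, the window size, and the toy embedding dimensions are all hypothetical; real ESM and ProtTrans embeddings are far larger, and a real Transformer layer uses learned query, key, and value projections.

```python
import math


def fuse_at_site(esm, prottrans, site, half_window=1):
    """Concatenate two per-residue embeddings inside a window around `site`.

    `esm` and `prottrans` are lists of per-residue feature vectors
    (hypothetical toy stand-ins for real language-model embeddings).
    """
    if len(esm) != len(prottrans):
        raise ValueError("embeddings must cover the same residues")
    lo = max(0, site - half_window)
    hi = min(len(esm), site + half_window + 1)
    # List concatenation joins the two feature vectors of each residue.
    return [esm[i] + prottrans[i] for i in range(lo, hi)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Single-head scaled dot-product self-attention.

    Assumption: identity projections (Q = K = V = x), so no learned weights.
    Returns the attended vectors and the attention weight matrix.
    """
    d = len(x[0])
    scores = [
        [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        for q in x
    ]
    weights = [softmax(r) for r in scores]
    out = [
        [sum(w * v[j] for w, v in zip(wrow, x)) for j in range(d)]
        for wrow in weights
    ]
    return out, weights


# Toy per-residue embeddings for a 4-residue protein (made-up numbers).
esm = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
prottrans = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]

# Fuse features for residues 1..3 around a mutation at index 2.
fused = fuse_at_site(esm, prottrans, site=2, half_window=1)
attended, weights = self_attention(fused)
```

Each row of `weights` sums to 1 and describes how strongly one residue in the window attends to the others. In a full model, the attended vectors would feed further layers and a classifier.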