The Ph.D. Program of Biotechnology and Biomedical Industry, China Medical University, Taichung, Taiwan.
Department of Information Engineering and Computer Science, Feng Chia University, Taichung, Taiwan.
Sci Rep. 2021 Jun 30;11(1):13599. doi: 10.1038/s41598-021-92793-w.
A single amino acid variation (SAV) is an amino acid substitution in a protein sequence that can affect the protein's overall structure or function, as well as its binding affinity. Protein destabilization is associated with diseases, including several cancers, but clarifying the relationship between SAVs and cancer through traditional experiments is time-consuming and resource-intensive. Several SAV prediction methods take a computational approach, and most of them predict SAV-induced changes in protein stability. In this study, all SAV characteristics derived from protein sequences, structures, and the microenvironment were converted into feature vectors and fed into an integrated prediction system that combines a support vector machine with a genetic algorithm. Critical features were used to estimate the relationship between their properties and cancers caused by SAVs. We describe the development of a prediction system, based on protein sequence and structure, that can distinguish whether a SAV is related to cancer. In five-fold cross-validation, the system achieved an accuracy of 89.73%, a Matthews correlation coefficient of 0.74, and an F1 score of 0.81. We have built an online prediction server, CanSavPre (http://bioinfo.cmu.edu.tw/CanSavPre/), which we expect to become a useful, practical tool for cancer research and precision medicine.
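The three metrics reported in the abstract (accuracy, Matthews correlation coefficient, and F1 score) can all be derived from a binary confusion matrix. The following is a minimal sketch of the standard formulas, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, MCC, and F1 for a binary confusion matrix.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives, and false negatives. "Positive" here would mean
    "cancer-related SAV" (an assumption made for illustration).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # MCC is conventionally set to 0 when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "mcc": mcc, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not data from the study).
    scores = binary_metrics(tp=40, fp=10, tn=45, fn=5)
    for name, value in scores.items():
        print(f"{name}: {value:.4f}")
```

Unlike accuracy, the Matthews correlation coefficient accounts for all four cells of the confusion matrix, which makes it more informative when the cancer-related and non-cancer-related classes are imbalanced.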