School of Internet of Things Engineering, Wuxi City College of Vocational Technology, Wuxi 214153, China.
J Theor Biol. 2018 Jun 14;447:147-153. doi: 10.1016/j.jtbi.2018.03.034. Epub 2018 Mar 27.
Presynaptic and postsynaptic neurotoxins are two important classes of neurotoxins isolated from the venoms of venomous animals, and they have proven potentially useful in neuroscience and pharmacology. As the number of toxin sequences in public databases grows, a computational method is needed for fast and accurate identification and classification of novel presynaptic and postsynaptic neurotoxins in large databases. In this study, a Multinomial Naive Bayes Classifier (MNBC) was developed to discriminate presynaptic from postsynaptic neurotoxins based on several kinds of features. The Minimum Redundancy Maximum Relevance (MRMR) feature selection method was used to rank 400 pseudo amino acid (PseAA) compositions, and the 50 top-ranked PseAA compositions were selected to improve the prediction results. The motif features, the 400 PseAA compositions, and the 50 selected PseAA compositions were combined and used as the input parameters of the MNBC. The best correlation coefficient (CC) value, 0.8213, was obtained when prediction quality was evaluated by the jackknife test. The algorithm presented in this study may become a useful tool for identifying presynaptic and postsynaptic neurotoxin sequences and may aid in-depth investigation of the biological mechanisms of these neurotoxins.
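The pipeline summarized in the abstract (count-based sequence features, a multinomial naive Bayes classifier, jackknife evaluation, and a correlation coefficient) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It rests on several assumptions the abstract does not confirm: that the 400 PseAA compositions correspond to the 20 x 20 dipeptide counts, that CC is the Matthews correlation coefficient, and that Laplace smoothing with `alpha=1.0` is used. The motif features and the MRMR ranking step are omitted, and all function names are hypothetical.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: the 400 features are the 20 x 20 adjacent-residue (dipeptide) pairs.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_counts(seq):
    """Count each of the 400 adjacent residue pairs in a protein sequence."""
    counts = [0] * len(DIPEPTIDES)
    for a, b in zip(seq, seq[1:]):
        i = INDEX.get(a + b)
        if i is not None:  # pairs containing non-standard residues are skipped
            counts[i] += 1
    return counts


def train_mnb(X, y, alpha=1.0):
    """Fit multinomial naive Bayes with Laplace smoothing (alpha is an assumption)."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        totals = [sum(col) for col in zip(*rows)]
        denom = sum(totals) + alpha * len(totals)
        log_prior = math.log(len(rows) / len(y))
        log_likelihood = [math.log((c + alpha) / denom) for c in totals]
        model[label] = (log_prior, log_likelihood)
    return model


def predict_mnb(model, x):
    """Return the label with the highest log-posterior score for count vector x."""
    def score(label):
        log_prior, log_likelihood = model[label]
        return log_prior + sum(c * w for c, w in zip(x, log_likelihood))
    return max(model, key=score)


def jackknife_predictions(X, y):
    """Jackknife (leave-one-out) test: predict each sample from a model trained on the rest."""
    preds = []
    for i in range(len(X)):
        model = train_mnb(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict_mnb(model, X[i]))
    return preds


def correlation_coefficient(y_true, y_pred, positive):
    """Matthews correlation coefficient for a two-class problem (assumed meaning of CC)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

To use the sketch, one would convert each labeled sequence with `dipeptide_counts`, pass the resulting vectors and labels to `jackknife_predictions`, and score the predictions with `correlation_coefficient`. A feature selection step such as MRMR would sit between feature extraction and training, restricting each vector to the selected columns.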