College of Information and Computer Engineering, Northeast Forestry University, Harbin, China.
BMC Bioinformatics. 2023 Apr 7;24(1):137. doi: 10.1186/s12859-023-05257-5.
Vesicle transport proteins play an important role in the transmembrane transport of molecules and are also relevant to biomedicine, so identifying them is particularly important. We propose a method based on ensemble learning and evolutionary information to identify vesicle transport proteins. First, we preprocess the imbalanced dataset by random undersampling. Second, we extract a position-specific scoring matrix (PSSM) from each protein sequence, derive AADP-PSSM and RPSSM features from the PSSM, and use the Max-Relevance-Max-Distance (MRMD) algorithm to select the optimal feature subset. Finally, the optimal feature subset is fed into a stacked classifier for vesicle transport protein identification. The experimental results show that the accuracy (ACC), sensitivity (SN) and specificity (SP) of our method on the independent testing set are 82.53%, 0.774 and 0.836, respectively. The SN, SP and ACC of our proposed method are 0.013, 0.007 and 0.76% higher, respectively, than those of the current state-of-the-art methods.
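As a rough illustration of two steps named in the abstract, the sketch below shows random undersampling of an imbalanced binary dataset and the computation of SN, SP and ACC from predicted labels. It is not the authors' implementation. The function names `random_undersample` and `sn_sp_acc`, the 1:1 class balance after undersampling, and the convention that label 1 marks a vesicle transport protein are all assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple


def random_undersample(
    samples: Sequence, labels: Sequence[int], seed: int = 0
) -> Tuple[List, List[int]]:
    """Randomly drop majority-class samples until both classes are the same size.

    Assumption: binary labels (1 = positive, 0 = negative) and a 1:1 target ratio.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority sample and an equally sized random subset of the majority.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [samples[i] for i in kept], [labels[i] for i in kept]


def sn_sp_acc(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sn = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    sp = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    acc = (tp + tn) / len(pairs) if pairs else 0.0
    return sn, sp, acc


if __name__ == "__main__":
    # Toy data: 3 positives and 7 negatives, balanced to 3 of each.
    xs, ys = random_undersample(list(range(10)), [1] * 3 + [0] * 7)
    print(len(xs), ys.count(1), ys.count(0))
    print(sn_sp_acc([1, 1, 0, 0, 0], [1, 0, 0, 0, 1]))
```

In a pipeline like the one described, undersampling would be applied to the training data only, so that SN, SP and ACC on the independent testing set reflect the original class distribution.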