FTWSVM-SR: DNA-Binding Proteins Identification via Fuzzy Twin Support Vector Machines on Self-Representation.

Affiliations

School of Internet of Things Engineering, Jiangnan University, Wuxi, 214122, People's Republic of China.

Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, People's Republic of China.

Publication information

Interdiscip Sci. 2022 Jun;14(2):372-384. doi: 10.1007/s12539-021-00489-6. Epub 2021 Nov 6.

DOI:10.1007/s12539-021-00489-6

PMID:34743286

Abstract

Due to the high cost of detecting DNA-binding proteins (DBPs), many machine learning (ML) algorithms have been used to process and detect DBPs at large scale. Previous methods did not account for noisy samples. In this study, a fuzzy twin support vector machine (FTWSVM) is employed to detect DBPs. First, multiple types of protein sequence features are converted into kernel matrices; then, a multiple kernel learning (MKL) algorithm is used to linearly combine the kernels; next, a self-representation-based (SR) membership function is used to estimate the membership value (weight) of each training sample; finally, the integrated kernel matrix and the membership values are fed into the FTWSVM-SR model for training and testing. Compared with other predictive models, FTWSVM based on SR (FTWSVM-SR) achieves the best Matthews correlation coefficient (MCC): 0.7410 and 0.5909 on two independent test sets (the PDB186 and PDB2272 datasets), respectively. The results confirm that our method can be an effective DBP detection tool. Our model can screen and analyze DBPs on a large scale before biochemical experiments are carried out.
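The first three steps of the pipeline (kernel construction, linear kernel combination, and self-representation-based membership estimation) can be illustrated with a minimal NumPy sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration: the Gaussian kernel, the fixed combination weights (the paper learns them with MKL), the ridge-regularized self-representation objective, and the mapping from reconstruction error to membership. The function names and the `lam` and `floor` parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X (assumed kernel choice)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def combine_kernels(kernels, weights):
    """Linearly combine kernel matrices; weights are normalized to sum to 1.

    In the paper the weights come from an MKL algorithm; here they are
    simply supplied by the caller.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))


def self_representation_membership(K, lam=1.0, floor=0.1):
    """Membership values in [floor, 1] from kernel-space reconstruction error.

    Assumed objective: Z = argmin ||Phi - Phi Z||_F^2 + lam * ||Z||_F^2,
    whose closed form is Z = (K + lam * I)^{-1} K. Samples that the other
    samples reconstruct poorly (likely noise) receive lower membership.
    """
    n = K.shape[0]
    Z = np.linalg.solve(K + lam * np.eye(n), K)
    KZ = K @ Z
    # Squared error for sample i: K_ii - 2 k_i^T z_i + z_i^T K z_i
    err = np.diag(K) - 2.0 * np.diag(KZ) + np.sum(Z * KZ, axis=0)
    err = np.maximum(err, 0.0)
    scale = err.max()
    if scale <= 0.0:
        return np.ones(n)
    # Linear map (an assumption): largest error -> floor, zero error -> 1.
    return 1.0 - (1.0 - floor) * err / scale
```

In a fuzzy SVM, membership values of this kind typically scale each sample's slack penalty, so that low-membership samples influence the decision boundary less. The FTWSVM solver itself is not sketched here. A real application might also compute memberships separately within each class rather than over the whole training set.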
