Information Science and Technology College, Dalian Maritime University, Dalian 116026, China.
Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac323.
Protein S-sulfinylation is an important posttranslational modification that regulates a variety of cell and protein functions. Studies have linked this modification to signal transduction, redox homeostasis and neuronal transmission. Identifying S-sulfinylation sites is therefore crucial to understanding the structure and function of this modification, which matters for both cell biology and human disease. In this study, we propose a multi-module deep learning framework named DLF-Sul for the identification of S-sulfinylation sites in proteins. First, three types of features are extracted: binary encoding, BLOSUM62 and amino acid index. Then, sequential features are extracted from these three feature types using a bidirectional long short-term memory network. Next, a multi-head self-attention mechanism is used to filter the effective attribute information, and a residual connection helps to reduce information loss. Furthermore, a convolutional neural network is employed to extract local deep feature information. Finally, fully connected layers act as a classifier that maps each sample to its corresponding label. On the independent test set, sensitivity, specificity, accuracy, Matthews correlation coefficient and area under the curve reach 91.80%, 92.36%, 92.08%, 0.8416 and 96.40%, respectively. The results show that DLF-Sul is an effective tool for predicting S-sulfinylation sites. The source code is available at https://github.com/ningq669/DLF-Sul.
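As an illustration of two pieces named in the abstract, the binary encoding feature and the threshold-based evaluation metrics, the following is a minimal sketch and not the DLF-Sul implementation. The residue ordering, the all-zero row for padding or non-standard characters, the function names and the confusion-matrix counts are all assumptions made for this example; the area under the curve is left out because it requires ranked prediction scores rather than counts.

```python
import math

# The 20 standard amino acids; this ordering is an assumption for the sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def binary_encode(peptide):
    """One-hot encode a peptide as one 20-dimensional row per residue.

    Characters outside the 20 standard residues (for example a padding
    symbol such as 'X') become all-zero rows. That padding convention is
    an assumption, not taken from the paper.
    """
    rows = []
    for residue in peptide:
        row = [0] * len(AMINO_ACIDS)
        idx = AA_INDEX.get(residue)
        if idx is not None:
            row[idx] = 1
        rows.append(row)
    return rows


def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and MCC from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a row or column of the matrix is empty;
    # returning 0.0 in that case is a common convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical inputs, chosen only to exercise the functions.
encoded = binary_encode("AKCDX")
metrics = classification_metrics(tp=90, fn=10, tn=80, fp=20)
```

Here `encoded` holds five rows of length 20, with the cysteine row carrying a single 1 and the padding row all zeros, and `metrics` holds the four scores for the made-up counts. In a full pipeline the encoded rows would be combined with the BLOSUM62 and amino acid index features before reaching the network.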