School of Software Engineering, South China University of Technology, Guangzhou, China.
School of Data Science and Information Engineering, Guizhou Minzu University, Guiyang, China.
BMC Genomics. 2023 Jul 13;24(1):393. doi: 10.1186/s12864-023-09468-1.
Because enhancers are dynamic, identifying them and predicting their strength is a major challenge in bioinformatics. With the development of deep learning, several models have advanced enhancer detection in recent years. However, existing studies either neglect information from motifs of different lengths or treat the features at all spatial locations equally. How to use multi-scale motif information effectively while ignoring irrelevant information remains an open question. In this paper, we propose iEnhancer-DCSA, an accurate and stable predictor built mainly from dual-scale fusion and spatial attention, which automatically extracts features of motifs of different lengths and selectively focuses on the important ones.
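The abstract does not give architectural details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that "dual-scale fusion" means running two 1D convolutions with different kernel widths over a one-hot DNA sequence and concatenating their channels, and that "spatial attention" means rescaling each sequence position by a weight between 0 and 1. The kernel widths (3 and 7), the channel count, the random untrained kernels, and the fixed mean-plus-max scoring rule are all hypothetical choices made for illustration; a trained model would learn these.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element one-hot rows."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]

def random_kernels(rng, n_out, k, n_in=4):
    """Untrained placeholder kernels: n_out kernels of shape k x n_in."""
    return [[[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(k)]
            for _ in range(n_out)]

def conv1d_same(x, kernels):
    """1D convolution + ReLU with zero padding (output length == input length).

    x is L x C_in; each kernel is k x C_in with k odd.
    """
    k = len(kernels[0])
    pad = k // 2
    zero = [0.0] * len(x[0])
    padded = [zero] * pad + x + [zero] * pad
    out = []
    for i in range(len(x)):
        window = padded[i:i + k]
        row = []
        for kern in kernels:
            s = sum(w * v
                    for krow, xrow in zip(kern, window)
                    for w, v in zip(krow, xrow))
            row.append(max(0.0, s))  # ReLU
        out.append(row)
    return out

def spatial_attention(feats):
    """Scale every position by sigmoid(mean + max over its channels).

    This fixed scoring rule is an assumption; real models learn the scoring.
    """
    weights = [1.0 / (1.0 + math.exp(-(sum(r) / len(r) + max(r))))
               for r in feats]
    attended = [[w * v for v in r] for w, r in zip(weights, feats)]
    return attended, weights

def dual_scale_features(seq, rng, n_out=8, small=3, large=7):
    """Fuse a short-motif and a long-motif branch, then apply attention."""
    x = one_hot(seq)
    short_branch = conv1d_same(x, random_kernels(rng, n_out, small))
    long_branch = conv1d_same(x, random_kernels(rng, n_out, large))
    fused = [a + b for a, b in zip(short_branch, long_branch)]  # concat channels
    return spatial_attention(fused)

if __name__ == "__main__":
    feats, weights = dual_scale_features("ACGTACGTAC", random.Random(0))
    print(len(feats), len(feats[0]))  # positions, fused channels
    print([round(w, 3) for w in weights])
```

Concatenating the two branches keeps short-motif and long-motif features side by side, and the per-position weights let later layers down-weight positions that carry little signal.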
Our experimental results demonstrate that iEnhancer-DCSA outperforms existing state-of-the-art methods on the test dataset. In particular, the accuracy and Matthews correlation coefficient (MCC) of enhancer identification improve by 3.45% and 9.41%, respectively, and the accuracy and MCC of enhancer classification improve by 7.65% and 18.1%, respectively. Furthermore, we conduct ablation studies to demonstrate the effectiveness of dual-scale fusion and spatial attention.
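For readers unfamiliar with the two reported metrics, the sketch below computes accuracy and MCC from a binary confusion matrix using their standard definitions. The function name and the example counts are hypothetical and are not taken from the paper's results.

```python
import math

def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, MCC) for a binary confusion matrix.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    By convention, MCC is reported as 0.0 when the denominator is zero.
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc

if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    acc, mcc = accuracy_and_mcc(tp=80, tn=70, fp=30, fn=20)
    print(f"accuracy={acc:.4f}, MCC={mcc:.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it can move by a larger margin than accuracy when the balance between false positives and false negatives changes.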
iEnhancer-DCSA will be a valuable computational tool for identifying and classifying enhancers, especially enhancers not included in the training dataset.