IEEE/ACM Trans Comput Biol Bioinform. 2022 Jan-Feb;19(1):295-304. doi: 10.1109/TCBB.2020.3005813. Epub 2022 Feb 3.
Alpha-helical transmembrane proteins (αTMPs) are essential in various biological processes. Although their tertiary structures are crucial for revealing complex functions, experimental structure determination remains challenging and costly. Over the past decades, various sequence-based topology prediction methods have been developed to bridge the gap between sequence and structure by characterizing structural features, but significant improvements are still required. Deep learning offers a great opportunity because of its powerful capability to learn representations from limited original data. In this work, we improved our αTMP topology prediction method, DMCTOP, using deep learning; the model is composed of two deep convolutional blocks that simultaneously extract local and global contextual features. Consequently, the inputs were simplified to reflect the original features of the sequence, comprising a protein sequence feature and an evolutionary conservation feature. DMCTOP can efficiently and accurately identify all topological types and the N-terminal orientation of an αTMP sequence. To validate the effectiveness of our method, we benchmarked DMCTOP against 13 peer methods in testing experiments, evaluating at the whole-sequence level, at the transmembrane-segment level, and by the traditional criterion. All results show that our method achieved the highest prediction accuracy and outperformed all previous methods. The method is available at https://icdtools.nenu.edu.cn/dmctop.
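As an illustration only, the two inputs named in the abstract (a protein sequence feature and an evolutionary conservation feature) and a per-residue topology output could be handled roughly as sketched below. This is not the DMCTOP implementation: the one-hot encoding, the shape of the conservation profile, the label alphabet (`i` inside, `M` membrane, `o` outside), and the helper names `one_hot`, `build_inputs`, and `segments` are all assumptions made for this sketch.

```python
from itertools import groupby

# Assumption: the 20 standard amino acids, one-hot encoded per residue.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence):
    """Encode each residue as a 20-dim indicator vector (all zeros if unknown)."""
    rows = []
    for residue in sequence.upper():
        row = [0.0] * len(AMINO_ACIDS)
        idx = AA_INDEX.get(residue)
        if idx is not None:
            row[idx] = 1.0
        rows.append(row)
    return rows


def build_inputs(sequence, conservation):
    """Concatenate, per residue, the one-hot row with a conservation row.

    `conservation` is a hypothetical per-residue profile (one list of
    floats per residue), e.g. derived from a multiple sequence alignment.
    """
    if len(sequence) != len(conservation):
        raise ValueError("sequence and conservation profile differ in length")
    return [oh + list(c) for oh, c in zip(one_hot(sequence), conservation)]


def segments(labels):
    """Collapse a per-residue label string into (label, start, end) runs.

    `end` is exclusive. The label alphabet is an assumption of this sketch.
    """
    runs, start = [], 0
    for label, group in groupby(labels):
        length = len(list(group))
        runs.append((label, start, start + length))
        start += length
    return runs
```

Under these assumptions, the N-terminal orientation of a predicted topology is the label of the first non-membrane run returned by `segments`, and the transmembrane segments are the runs labeled `M`.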