Makigaki Shuichiro, Ishida Takashi
School of Computing, Tokyo Institute of Technology, Tokyo, Japan.
Bio Protoc. 2020 May 5;10(9):e3600. doi: 10.21769/BioProtoc.3600.
Template-based modeling, the process of predicting the tertiary structure of a protein from homologous protein structures, is useful when good templates are available. Modern homology detection methods can find remote homologs with high sensitivity. However, template-based models built from the alignments that these homology detection methods produce are often less accurate than models built from ideal alignments. In this study, we propose a new method that generates pairwise sequence alignments for more accurate template-based modeling. Our method trains a machine learning model on structural alignments of known homologs. When calculating a sequence alignment, the method does not use a fixed substitution matrix; instead, it dynamically predicts each substitution score from the trained model.
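The idea of replacing a fixed substitution matrix with a predicted score can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes a plain global (Needleman-Wunsch style) dynamic program with a linear gap penalty, and the names `global_align` and `toy_scorer` are hypothetical. The only point it shows is that the alignment routine asks a callable for the score of each residue pair, so a trained model could be plugged in where the toy scorer is.

```python
from typing import Callable, List, Tuple

# score(i, j) returns the substitution score for query position i and
# template position j. In the method described above this value would come
# from a trained model; here it is an arbitrary callable (assumption).
Scorer = Callable[[int, int], float]


def global_align(query: str, template: str, score: Scorer,
                 gap: float = -1.0) -> Tuple[str, str, float]:
    """Global alignment with a linear gap penalty and a pluggable scorer."""
    n, m = len(query), len(template)
    # dp[i][j] holds the best score for aligning query[:i] with template[:j].
    dp: List[List[float]] = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(i - 1, j - 1),  # align the two residues
                dp[i - 1][j] + gap,                      # gap in the template
                dp[i][j - 1] + gap,                      # gap in the query
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    aligned_q: List[str] = []
    aligned_t: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(i - 1, j - 1):
            aligned_q.append(query[i - 1])
            aligned_t.append(template[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or dp[i][j] == dp[i - 1][j] + gap):
            aligned_q.append(query[i - 1])
            aligned_t.append("-")
            i -= 1
        else:
            aligned_q.append("-")
            aligned_t.append(template[j - 1])
            j -= 1
    return "".join(reversed(aligned_q)), "".join(reversed(aligned_t)), dp[n][m]


def toy_scorer(query: str, template: str) -> Scorer:
    """Stand-in for a trained model: +2 for identical residues, -1 otherwise."""
    def score(i: int, j: int) -> float:
        return 2.0 if query[i] == template[j] else -1.0
    return score


if __name__ == "__main__":
    q, t = "ACDE", "ACE"
    print(global_align(q, t, toy_scorer(q, t)))
```

Because the scorer receives positions rather than residue letters, a real predictor could use position-specific features around each residue instead of the residue identity alone; which features the trained model actually uses is not stated in this abstract.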