Department of Computer Science, Saint Louis University, St. Louis, Missouri.
Department of Mathematics and Computer Science, University of Missouri-St. Louis, St. Louis, Missouri.
Proteins. 2020 Jun;88(6):775-787. doi: 10.1002/prot.25865. Epub 2019 Dec 27.
Many proteins are composed of several domains that pack together into a complex tertiary structure. Multidomain proteins can be challenging for protein structure modeling, particularly those for which templates can be found for individual domains but not for the entire sequence. In such cases, homology modeling can generate high-quality models of the domains but not of the orientations between domains. Small-angle X-ray scattering (SAXS) reports the structural properties of entire proteins and has the potential to guide homology modeling of multidomain proteins. In this article, we describe a novel multidomain protein assembly modeling method, SAXSDom, that integrates experimental knowledge from SAXS with a probabilistic Input-Output Hidden Markov Model to assemble the structures of individual domains. Four SAXS-based scoring functions were developed and tested, and the method was evaluated on multidomain proteins from two public datasets. Incorporation of SAXS information improved the accuracy of domain assembly for 40 of 46 multidomain protein targets from the Critical Assessment of protein Structure Prediction (CASP) and for 45 of 73 multidomain protein targets from the ab initio domain assembly dataset. The results demonstrate that SAXS data can provide useful information to improve the accuracy of domain-domain assembly. The source code and tool packages are available at https://github.com/jianlin-cheng/SAXSDom.
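To illustrate how a SAXS profile could score a candidate domain arrangement, the sketch below computes a theoretical scattering curve with the Debye formula and compares it with an experimental curve through a reduced chi-square. It is not taken from the SAXSDom code and does not reproduce any of the four scoring functions mentioned above; the function names `debye_profile` and `chi_square`, the one-point-per-residue representation, and the unit form factors are all assumptions made for illustration.

```python
import numpy as np


def debye_profile(coords, q_values, form_factors=None):
    """Theoretical SAXS intensity I(q) from point scatterers (Debye formula).

    Assumption: each row of `coords` is one scatterer (e.g. one C-alpha
    atom) with a q-independent form factor, 1.0 by default.
    """
    coords = np.asarray(coords, dtype=float)
    q_values = np.asarray(q_values, dtype=float)
    n = coords.shape[0]
    f = np.ones(n) if form_factors is None else np.asarray(form_factors, dtype=float)

    # Pairwise distance matrix r_ij, shape (n, n).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    weights = np.outer(f, f)

    # np.sinc(x) = sin(pi x) / (pi x), so sin(q r) / (q r) = np.sinc(q r / pi).
    # np.sinc also handles q r = 0 (value 1) without a division by zero.
    return np.array([(weights * np.sinc(q * dist / np.pi)).sum() for q in q_values])


def chi_square(i_calc, i_exp, sigma):
    """Chi-square between computed and experimental profiles.

    The computed curve is first rescaled by the least-squares optimal
    factor, because the absolute intensity scale is treated as arbitrary
    here. The sum is divided by the number of q points.
    """
    i_calc = np.asarray(i_calc, dtype=float)
    i_exp = np.asarray(i_exp, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    w = 1.0 / sigma ** 2
    scale = (w * i_exp * i_calc).sum() / (w * i_calc ** 2).sum()
    return float((w * (i_exp - scale * i_calc) ** 2).mean())
```

Under these assumptions, two candidate domain orientations could be ranked by computing `debye_profile` for each assembled model and keeping the one with the lower `chi_square` against the measured curve.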