BMC Bioinformatics. 2014;15 Suppl 12(Suppl 12):S5. doi: 10.1186/1471-2105-15-S12-S5. Epub 2014 Nov 6.
The accurate packing of protein side chains is important for many computational biology problems, such as ab initio protein structure prediction, homology modelling, protein design, and ligand docking. Many existing solutions model side-chain packing as a computational optimisation problem. Beyond the difficulty of designing search algorithms, most solutions suffer from an inaccurate energy function for judging whether a prediction is good or bad. Even if the search finds the lowest energy, there is no certainty of obtaining protein structures with correct side chains.
We present a side-chain modelling method, pacoPacker, which uses a parallel ant colony optimisation strategy based on sharing a single pheromone matrix. This parallel approach combines energy functions from different sources and generates protein side-chain conformations with the lowest energies jointly determined by those energy functions. We further optimised the selected rotamers by rotamer minimisation to construct subrotamers, which reasonably alleviated the discreteness of the rotamer library.
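The idea of several colonies sharing one pheromone matrix can be illustrated with a minimal sketch. It is not the pacoPacker implementation: the function name `shared_pheromone_aco`, the toy energy functions, the update rule, and all parameter values are assumptions made for illustration, and the colonies run sequentially here rather than in parallel.

```python
import random


def shared_pheromone_aco(n_residues, n_rotamers, energy_fns,
                         n_iters=50, ants_per_colony=10,
                         evaporation=0.1, seed=0):
    """Toy ant colony optimisation with one pheromone matrix shared by all colonies.

    energy_fns: one callable per colony; each maps a rotamer assignment
    (list of rotamer indices, one per residue) to an energy (lower is better).
    """
    rng = random.Random(seed)
    # Single shared matrix: pheromone[i][r] is the trail for rotamer r at residue i.
    pheromone = [[1.0] * n_rotamers for _ in range(n_residues)]

    for _ in range(n_iters):
        deposits = []
        # Each colony scores candidates with its own energy function
        # but samples from the same pheromone matrix.
        for energy in energy_fns:
            colony_best = None
            for _ in range(ants_per_colony):
                assignment = [
                    rng.choices(range(n_rotamers), weights=pheromone[i])[0]
                    for i in range(n_residues)
                ]
                e = energy(assignment)
                if colony_best is None or e < colony_best[0]:
                    colony_best = (e, assignment)
            deposits.append(colony_best[1])

        # Evaporate, then let every colony's best ant reinforce the shared trails.
        for row in pheromone:
            for r in range(n_rotamers):
                row[r] *= 1.0 - evaporation
        for assignment in deposits:
            for i, r in enumerate(assignment):
                pheromone[i][r] += 1.0

    # Consensus: the rotamer with the strongest trail at each residue.
    consensus = [
        max(range(n_rotamers), key=lambda r: pheromone[i][r])
        for i in range(n_residues)
    ]
    return consensus, pheromone


# Two hypothetical energy functions standing in for different energy sources.
def toy_energy_a(assignment):
    return sum(assignment)


def toy_energy_b(assignment):
    return sum(r * r for r in assignment)


consensus, pheromone = shared_pheromone_aco(5, 3, [toy_energy_a, toy_energy_b])
```

Because every colony deposits on the same matrix, a rotamer accumulates a strong trail only when it is favoured across the energy functions, which is one simple way to read "jointly determined".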
We focused on improving the accuracy of side-chain conformation prediction. For a testing set of 442 proteins, 87.19% of χ1 and 77.11% of χ1+2 angles were predicted correctly within 40° of the X-ray positions. We compared the accuracy of pacoPacker with state-of-the-art methods, such as CIS-RR and SCWRL4. We analysed the results from different perspectives, at the level of protein chains and of individual residues. In this comprehensive benchmark, for 51.5% of proteins of up to 400 amino acids in length, the pacoPacker predictions were superior to those of both CIS-RR and SCWRL4. Finally, we also showed the advantage of using the subrotamer strategy. All results confirmed that our parallel approach is competitive with state-of-the-art solutions for packing side chains.
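The 40° criterion can be sketched in a few lines. This is an illustrative reading, not the authors' evaluation script: the helper names are hypothetical, χ1+2 is assumed to count as correct only when both χ1 and χ2 fall within the tolerance, and the symmetry of residues such as Phe, Tyr, and Asp (whose χ2 is equivalent under a 180° flip) is ignored.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def chi_accuracy(predicted, native, tol=40.0):
    """Fraction of correct chi1 and chi1+2 predictions.

    predicted, native: per-residue tuples of dihedral angles in degrees,
    e.g. () for Gly/Ala, (chi1,) for Ser, (chi1, chi2) for Leu.
    Returns (chi1_accuracy, chi12_accuracy); an entry is None when no
    residue has the corresponding angles.
    """
    n1 = ok1 = n12 = ok12 = 0
    for p, x in zip(predicted, native):
        if len(x) >= 1 and len(p) >= 1:
            n1 += 1
            chi1_ok = angle_diff(p[0], x[0]) <= tol
            ok1 += chi1_ok
            if len(x) >= 2 and len(p) >= 2:
                n12 += 1
                # Assumed rule: chi1+2 is correct only if both angles are correct.
                ok12 += chi1_ok and angle_diff(p[1], x[1]) <= tol
    acc1 = ok1 / n1 if n1 else None
    acc12 = ok12 / n12 if n12 else None
    return acc1, acc12


# Made-up angles for three residues, purely to exercise the function.
predicted = [(60.0, 180.0), (-170.0, 90.0), (65.0,)]
native = [(70.0, 170.0), (175.0, -60.0), (-60.0,)]
acc1, acc12 = chi_accuracy(predicted, native)
```

The wrap-around in `angle_diff` matters: a prediction of -170° against a native 175° differs by 15°, not 345°.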
This parallel approach combines various sources of search intelligence and energy functions to pack protein side chains. It provides a framework for combining different inaccurate yet useful objective functions by designing parallel heuristic search algorithms.