Schneider Michael, Brock Oliver
Robotics and Biology Laboratory, Department of Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany.
PLoS One. 2014 Oct 22;9(10):e108438. doi: 10.1371/journal.pone.0108438. eCollection 2014.
We introduce a novel contact prediction method that achieves high prediction accuracy by combining evolutionary and physicochemical information about native contacts. We obtain evolutionary information from multiple-sequence alignments and physicochemical information from predicted ab initio protein structures. These structures represent low-energy states in an energy landscape and thus capture the physicochemical information encoded in the energy function. Such low-energy structures are likely to contain native contacts, even if their overall fold is not native. To differentiate native from non-native contacts in those structures, we develop a graph-based representation of the structural context of contacts. We then use this representation to train a support vector machine classifier to identify the most likely native contacts in otherwise non-native structures. The resulting contact predictions are highly accurate. Because we combine two sources of information, evolutionary and physicochemical, we maintain prediction accuracy even when only a few sequence homologs are present. We show that the predicted contacts help to improve ab initio structure prediction. A web service is available at http://compbio.robotics.tu-berlin.de/epc-map/.
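The idea of describing a contact by its structural context in a set of low-energy structures can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper: the 8 Å distance cutoff, the minimum sequence separation, the function names (contact_graph, context_features, contact_frequency), and the three toy descriptors. The actual graph-based representation and feature set used by the authors may differ substantially. Classifier training is omitted; feature vectors of this kind, together with an evolutionary score derived from a multiple-sequence alignment, could in principle be passed to a support vector machine.

```python
from itertools import combinations
from math import dist

CONTACT_CUTOFF = 8.0  # assumed distance cutoff in angstroms (not from the abstract)
MIN_SEPARATION = 2    # assumed minimum sequence separation (not from the abstract)


def contact_graph(coords, cutoff=CONTACT_CUTOFF, min_sep=MIN_SEPARATION):
    """Build a residue contact graph from one structure.

    coords holds one representative 3D point per residue. Two residues are
    connected if they are at least min_sep apart in sequence and at most
    cutoff apart in space. Returns a dict: residue index -> set of partners.
    """
    graph = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if j - i >= min_sep and dist(coords[i], coords[j]) <= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def context_features(graph, i, j):
    """Toy descriptors of the graph neighbourhood around the pair (i, j)."""
    return {
        "degree_i": len(graph[i]),
        "degree_j": len(graph[j]),
        "shared_neighbours": len(graph[i] & graph[j]),
    }


def contact_frequency(decoys, i, j):
    """Fraction of decoy structures in which residues i and j are in contact."""
    graphs = [contact_graph(coords) for coords in decoys]
    return sum(j in g[i] for g in graphs) / len(graphs)


# Two hypothetical four-residue decoys that differ in the position of residue 2.
decoy_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 3.8, 0.0)]
decoy_b = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0), (3.8, 3.8, 0.0)]

graph_a = contact_graph(decoy_a)
features = context_features(graph_a, 0, 3)
frequency = contact_frequency([decoy_a, decoy_b], 0, 2)
```

In this sketch each decoy contributes one graph, so a contact that recurs across many decoys and sits in a well-connected neighbourhood yields larger feature values than an isolated one. Whether such descriptors separate native from non-native contacts would have to be learned from structures with known native contacts.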