Université Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, Grenoble, France.
Bioinformatics. 2019 Sep 15;35(18):3313-3319. doi: 10.1093/bioinformatics/btz122.
Protein model quality assessment (QA) is a crucial and yet open problem in structural bioinformatics. The current best methods for single-model QA typically combine results from different approaches, each based on different input features constructed by experts in the field. The prediction model is then trained using a machine-learning algorithm. Recently, with the development of convolutional neural networks (CNNs), the training paradigm has changed. In computer vision, expert-developed features have been significantly surpassed by automatically trained convolutional filters. This motivated us to apply a three-dimensional (3D) CNN to the problem of protein model QA.
We developed Ornate (Oriented Routed Neural network with Automatic Typing), a novel method for single-model QA. Ornate is a residue-wise scoring function that takes 3D density maps as input. It predicts the local (residue-wise) and the global model quality through a deep 3D CNN. Specifically, Ornate aligns the input density map, corresponding to each residue and its neighborhood, with the backbone topology of this residue. This circumvents the problem of ambiguous orientations of the initial models. Ornate also includes automatic identification of atom types and dynamic routing of the data in the network. Established benchmarks (CASP 11 and CASP 12) demonstrate the state-of-the-art performance of our approach among single-model QA methods.
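The alignment step described above can be illustrated with a minimal sketch. It is not the Ornate implementation: the function names `local_frame` and `residue_density`, the choice of axes built from the N, CA and C backbone atoms, the single density channel, and the default grid size, spacing and Gaussian width are all assumptions made for illustration. The sketch only shows why expressing a residue's neighborhood in a backbone-defined frame makes the resulting density map independent of the model's original orientation.

```python
import numpy as np


def local_frame(n, ca, c):
    """Build a right-handed orthonormal frame from backbone atoms.

    Hypothetical convention: x points from CA to C, z is normal to the
    N-CA-C plane, and y completes the frame. Rows of the result are the axes.
    """
    n, ca, c = (np.asarray(a, dtype=float) for a in (n, ca, c))
    x = c - ca
    x = x / np.linalg.norm(x)
    z = np.cross(x, n - ca)
    z = z / np.linalg.norm(z)
    y = np.cross(z, x)
    return np.stack([x, y, z])


def residue_density(coords, n, ca, c, size=24, spacing=0.8, sigma=1.0):
    """Voxelize neighboring atoms into a cubic grid centered on CA.

    coords: (N, 3) atom positions around the residue, in any orientation.
    size, spacing (in angstroms) and sigma are illustrative values only.
    Returns a (size, size, size) array of summed Gaussian densities.
    """
    coords = np.asarray(coords, dtype=float)
    ca = np.asarray(ca, dtype=float)
    frame = local_frame(n, ca, c)

    # Express every atom in the residue-local frame.
    local = (coords - ca) @ frame.T

    # Voxel centers along one axis, symmetric around the origin.
    axis = (np.arange(size) + 0.5) * spacing - size * spacing / 2.0
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")

    grid = np.zeros((size, size, size))
    for px, py, pz in local:
        d2 = (gx - px) ** 2 + (gy - py) ** 2 + (gz - pz) ** 2
        grid += np.exp(-d2 / (2.0 * sigma ** 2))
    return grid
```

Because the frame is derived from the residue itself, applying the same rigid rotation and translation to all input atoms leaves the returned grid unchanged, which is the property the orientation step relies on. A multi-channel version with one grid per atom type would follow the same pattern.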
The method is available at https://team.inria.fr/nano-d/software/Ornate/. It consists of a C++ executable that transforms molecular structures into volumetric density maps, and Python code based on the TensorFlow framework that applies the Ornate model to these maps.
Supplementary data are available at Bioinformatics online.