Ekenna Chinwe, Thomas Shawna, Amato Nancy M
Department of Computer Science and Engineering, Texas A&M University, College Station, 77843, TX, USA.
BMC Syst Biol. 2016 Aug 1;10 Suppl 2(Suppl 2):49. doi: 10.1186/s12918-016-0297-9.
Simulating protein folding motions is an important problem in computational biology. Motion planning algorithms, such as Probabilistic Roadmap Methods, have been successful in modeling the folding landscape. Probabilistic Roadmap Methods and their variants consist of several phases (i.e., sampling, connection, and path extraction). Most of the time is spent in the connection phase, and selecting which variant to employ is a difficult task. Global machine learning has been applied to the connection phase but is inefficient in situations with varying topology, such as those typical of folding landscapes.
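As an illustration of the three phases named above, the following is a minimal sketch of a generic Probabilistic Roadmap Method on a toy two-dimensional configuration space. It is not the authors' implementation: the function names (`build_roadmap`, `extract_path`), the unit-square sampling, the k-nearest-neighbor connection rule, and the `is_valid` and `edge_ok` callbacks are all assumptions made for the example. A protein model would use conformations and an energy-based validity test instead.

```python
import heapq
import math
import random


def build_roadmap(n_samples, k, is_valid, edge_ok, rng):
    """Toy PRM: sample points in the unit square, then connect neighbors."""
    # Sampling phase: keep random configurations that pass the validity check.
    nodes = []
    while len(nodes) < n_samples:
        q = (rng.random(), rng.random())
        if is_valid(q):
            nodes.append(q)

    # Connection phase: try to link each node to its k nearest neighbors.
    edges = {i: {} for i in range(len(nodes))}
    for i, q in enumerate(nodes):
        nearest = sorted(
            (j for j in range(len(nodes)) if j != i),
            key=lambda j: math.dist(q, nodes[j]),
        )[:k]
        for j in nearest:
            if edge_ok(q, nodes[j]):
                w = math.dist(q, nodes[j])
                edges[i][j] = w  # store the edge in both directions
                edges[j][i] = w
    return nodes, edges


def extract_path(edges, start, goal):
    """Path extraction phase: Dijkstra's shortest path over the roadmap."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in edges[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if goal not in dist:
        return None  # start and goal lie in different roadmap components
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]
```

In this sketch the connection phase performs a neighbor search and an edge check for every node, which is consistent with the observation that most of the time is spent there.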
We develop a local learning algorithm that uses the past performance of connection methods within the neighborhood of the current connection attempt as its basis for learning. It is sensitive not only to different types of landscapes but also to differing regions within a landscape, removing the need to explicitly partition the landscape. We perform experiments on 23 proteins of varying secondary structure makeup with 52-114 residues. We compare the success rate of our method with that of other methods. We demonstrate a clear need for learning (i.e., only learning methods were able to validate against all available experimental data) and show that local learning is superior to global learning, producing, in many cases, significantly higher-quality results than the other methods.
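The idea of learning from past performance in the neighborhood of the current connection attempt can be sketched as follows. This is a hedged illustration only: the abstract does not give the update rule, so the class name `LocalMethodSelector`, the k-nearest-neighbor neighborhood, the smoothed success rate, and the weighted random choice are all assumptions, and the method names used with it are hypothetical placeholders.

```python
import math
import random


class LocalMethodSelector:
    """Pick a connection method from nearby past successes (illustrative)."""

    def __init__(self, methods, k=5, prior=1.0, rng=None):
        self.methods = list(methods)  # hypothetical connection method names
        self.k = k                    # neighborhood size (assumed parameter)
        self.prior = prior            # smoothing so untried methods keep a chance
        self.rng = rng or random.Random()
        self.history = []             # (configuration, method, success) records

    def weights(self, q):
        # Look only at the k past attempts closest to configuration q.
        near = sorted(self.history, key=lambda h: math.dist(q, h[0]))[: self.k]
        w = {}
        for m in self.methods:
            tries = sum(1 for _, name, _ in near if name == m)
            wins = sum(1 for _, name, ok in near if name == m and ok)
            # Smoothed local success rate of method m.
            w[m] = (wins + self.prior) / (tries + 2 * self.prior)
        return w

    def choose(self, q):
        # Sample a method with probability proportional to its local weight.
        w = self.weights(q)
        return self.rng.choices(self.methods, weights=[w[m] for m in self.methods])[0]

    def record(self, q, method, success):
        # Store the outcome so later attempts near q can learn from it.
        self.history.append((tuple(q), method, bool(success)))
```

Because the weights are recomputed from whichever records are nearest to each query, the preferred method can differ from one region to another without any explicit partition of the space, which is the property the paragraph above describes.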
We present an algorithm that uses local learning to select appropriate connection methods in the context of roadmap construction for protein folding. Our method removes the burden of deciding which method to use, leverages the strengths of the individual input methods, and can be extended to include future connection methods.