IBM Research UK, Hartree Centre, Daresbury WA4 4AD, United Kingdom.
STFC Daresbury Laboratories, Daresbury WA4 4AD, United Kingdom.
J Chem Phys. 2018 Jun 28;148(24):241744. doi: 10.1063/1.5027261.
Simulation and data analysis have evolved into powerful methods for discovering and understanding molecular modes of action and for designing new compounds that exploit these modes. The combination provides a strong impetus to create and exploit new tools and techniques at the interfaces between physics, biology, and data science as a pathway to new scientific insight and accelerated discovery. In this context, we explore the rational design of novel antimicrobial peptides (short protein sequences exhibiting broad activity against multiple species of bacteria). We show how datasets can be harvested to reveal features that inform new design concepts. We introduce new analysis and visualization tools: a graphical representation of the k-mer spectrum as a fundamental property encoded in antimicrobial peptide databases, and a data-driven representation to illustrate membrane binding and permeation of helical peptides.
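As a rough illustration of what a k-mer spectrum is, the sketch below counts every overlapping subsequence of length k across a set of peptide sequences. It is a generic counting routine and not the authors' tool; the function name `kmer_spectrum` and the three toy sequences are hypothetical and are not drawn from any antimicrobial peptide database.

```python
from collections import Counter


def kmer_spectrum(sequences, k):
    """Count every overlapping k-mer across a collection of peptide sequences.

    Hypothetical helper for illustration; sequences shorter than k
    contribute nothing to the counts.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k along the sequence, one residue at a time.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


# Toy input in one-letter amino acid code (assumed, for illustration only).
peptides = ["GIGKFLHSAKKF", "KKFLKKAKKF", "GIGAVLKVL"]
spectrum = kmer_spectrum(peptides, 3)

# List the most frequent 3-mers with their counts.
for kmer, count in spectrum.most_common(5):
    print(kmer, count)
```

Plotting these counts, for example as a bar chart or heat map over k-mers, would give one possible graphical view of the spectrum; the representation used in the paper may differ.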