Lee Ernest Y, Fulan Benjamin M, Wong Gerard C L, Ferguson Andrew L
Department of Bioengineering, University of California, Los Angeles, CA 90095.
Department of Mathematics, University of Illinois at Urbana-Champaign, Urbana, IL 61801.
Proc Natl Acad Sci U S A. 2016 Nov 29;113(48):13588-13593. doi: 10.1073/pnas.1609893113. Epub 2016 Nov 14.
There are some ∼1,100 known antimicrobial peptides (AMPs), which permeabilize microbial membranes but have diverse sequences. Here, we develop a support vector machine (SVM)-based classifier to investigate α-helical AMPs and the interrelated nature of their functional commonality and sequence homology. The SVM is used to search the undiscovered peptide sequence space and identify Pareto-optimal candidates that simultaneously maximize the distance σ from the SVM hyperplane (and thus their "antimicrobialness") and their α-helicity, but minimize mutational distance to known AMPs. By calibrating the SVM results with killing assays and small-angle X-ray scattering (SAXS), we find that the SVM metric σ correlates not with a peptide's minimum inhibitory concentration (MIC), but rather with its ability to generate negative Gaussian membrane curvature. This surprising result provides a topological basis for membrane activity common to AMPs. Moreover, we highlight an important distinction between the maximal recognizability of a sequence to a trained AMP classifier (its ability to generate membrane curvature) and its maximal antimicrobial efficacy. As mutational distances from known AMPs increase, we find AMP-like sequences that are increasingly difficult for nature to discover via simple mutation. Using the sequence map as a discovery tool, we find an unexpectedly diverse taxonomy of sequences that are just as membrane-active as known AMPs, but with a broad range of primary functions distinct from AMP functions, including endogenous neuropeptides, viral fusion proteins, topogenic peptides, and amyloids. The SVM classifier is useful as a general detector of membrane activity in peptide sequences.
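The Pareto-optimal selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` type, the peptide labels, and all numeric values are hypothetical, chosen only to show how candidates might be filtered when σ and α-helicity are maximized while mutational distance is minimized.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical record for one candidate peptide (illustrative only)."""
    name: str        # made-up label, not a real peptide
    sigma: float     # distance from the SVM hyperplane (to maximize)
    helicity: float  # predicted alpha-helicity score (to maximize)
    mut_dist: float  # mutational distance to nearest known AMP (to minimize)


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on every objective
    and strictly better on at least one."""
    no_worse = (
        a.sigma >= b.sigma
        and a.helicity >= b.helicity
        and a.mut_dist <= b.mut_dist
    )
    strictly_better = (
        a.sigma > b.sigma
        or a.helicity > b.helicity
        or a.mut_dist < b.mut_dist
    )
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Made-up values for illustration; these are not data from the paper.
candidates = [
    Candidate("pep_A", 1.8, 0.90, 3),
    Candidate("pep_B", 1.2, 0.95, 2),
    Candidate("pep_C", 1.0, 0.80, 4),  # dominated by pep_A
    Candidate("pep_D", 0.5, 0.60, 1),
    Candidate("pep_E", 1.8, 0.85, 5),  # dominated by pep_A
]

front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy set, pep_A, pep_B, and pep_D survive because each is best on at least one objective relative to every rival. The pairwise comparison is quadratic in the number of candidates, which is adequate for a sketch but would likely need a faster non-dominated sort for a large sequence search.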