CONCEPT Lab, Istituto Italiano di Tecnologia, Via Melen - 83, B Block, 16152 Genova, Italy.
J Chem Theory Comput. 2023 Aug 8;19(15):5242-5259. doi: 10.1021/acs.jctc.2c01306. Epub 2023 Jul 20.
We present SiteFerret, a novel method for the automatic detection of pockets on protein molecular surfaces. The algorithm is based on an ad hoc hierarchical clustering of virtual probe spheres obtained from the geometrical primitives that the NanoShaper software uses to build the solvent-excluded molecular surface. The final ranking of putative pockets is based on the Isolation Forest method, an unsupervised learning approach originally developed for anomaly detection. A detailed importance analysis of pocket features provides insight into which geometrical (clustering) and chemical (amino acid composition) properties characterize a good binding site. The method also provides a segmentation of pockets into smaller subpockets. We show that subpockets are a convenient representation for pinpointing the binding site with high precision. SiteFerret is notably versatile, accurately predicting a wide range of binding sites, from those binding small molecules to those binding peptides, including difficult shallow sites.
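As an illustration of the ranking idea only, the following is a minimal, standard-library sketch of an Isolation Forest. It is not the SiteFerret implementation. The two-feature vectors, the function names (`fit_forest`, `anomaly_score`, `rank_candidates`), the parameter defaults, and the choice to rank candidates from most typical to most anomalous relative to a reference set are all assumptions made for the example; the abstract does not specify how the scores are turned into a ranking.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively isolate points with random axis-aligned splits."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:
        # Simplification: stop instead of retrying with another feature.
        return ("leaf", len(rows))
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return ("node", feature, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x):
    """Depth at which x lands, plus a correction for unsplit leaf points."""
    depth = 0
    while tree[0] == "node":
        _, feature, split, left, right = tree
        tree = left if x[feature] < split else right
        depth += 1
    return depth + c_factor(tree[1])


def fit_forest(rows, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on a random subsample of rows."""
    if len(rows) < 2:
        raise ValueError("need at least two training rows")
    rng = random.Random(seed)
    m = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(m))
    trees = [build_tree(rng.sample(rows, m), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, m


def anomaly_score(forest, x):
    """Score in (0, 1]; values near 1 mean x is isolated after few splits."""
    trees, m = forest
    mean_path = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(m))


def rank_candidates(forest, candidates):
    """Assumed convention: most typical (lowest anomaly score) first."""
    return sorted(candidates, key=lambda x: anomaly_score(forest, x))
```

In this sketch a forest would be fitted on feature vectors from a reference set (for instance, hypothetical geometric descriptors of known binding pockets) and then used to order new candidates. Points that are separated from the rest after only a few random splits receive scores closer to 1.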