Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, 15260, USA.
Sci Rep. 2024 Sep 5;14(1):20722. doi: 10.1038/s41598-024-71699-3.
Here we introduce Ensemble Optimizer (EnOpt), a machine-learning tool that improves the accuracy and interpretability of ensemble virtual screening (VS). Ensemble VS is an established method for predicting protein/small-molecule (ligand) binding. Unlike traditional VS, which considers a single protein conformation, ensemble VS better accounts for protein flexibility by predicting binding to multiple protein conformations. Each compound is thus associated with a spectrum of scores (one score per protein conformation) rather than a single score. To rank and prioritize the molecules for further evaluation (including experimental testing), researchers must decide which protein conformations to consider and how best to map each compound's spectrum of scores to a single value. Both decisions are system-specific. EnOpt uses machine learning to address these challenges. Benchmark virtual screens show that, for many systems, EnOpt's ranking distinguishes active compounds from inactive or decoy molecules more effectively than traditional ensemble VS methods do. To encourage broad adoption, we release EnOpt free of charge under the terms of the MIT license.
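To make the score-mapping problem concrete, the following is a minimal sketch of two conventional ways to collapse a compound's spectrum of scores into one value: taking the best score across conformations, or taking the average. It is not EnOpt's method or API. The compound names, the score values, and the `rank` helper are all hypothetical, and the sketch assumes docking-style scores where a more negative value means stronger predicted binding.

```python
from statistics import mean

# Hypothetical docking scores (assumed convention: more negative = better).
# Each compound has one score per protein conformation; values are invented
# purely for illustration.
scores = {
    "compound_A": [-9.1, -6.0, -6.2],
    "compound_B": [-7.5, -7.8, -7.6],
    "compound_C": [-5.0, -5.5, -4.8],
}


def rank(scores, aggregate):
    """Collapse each compound's score spectrum to one value, then sort best-first."""
    collapsed = {name: aggregate(spectrum) for name, spectrum in scores.items()}
    # Ascending sort puts the most negative (best) collapsed score first.
    return sorted(collapsed, key=collapsed.get)


# "Best score" scheme: keep the single most favorable conformation.
best_score_ranking = rank(scores, min)

# "Average score" scheme: weight every conformation equally.
mean_score_ranking = rank(scores, mean)

print(best_score_ranking)
print(mean_score_ranking)
```

With these invented numbers the two schemes disagree about the top compound: compound_A wins on its single best conformation, while compound_B wins on average. This is the kind of system-specific choice the abstract describes.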