Instituto de Telecomunicações, Delegação da Covilhã, Covilhã, Portugal.
Universidade da Beira Interior, Departamento de Informática, Covilhã, Portugal.
PLoS One. 2019 Oct 14;14(10):e0223596. doi: 10.1371/journal.pone.0223596. eCollection 2019.
Extensive research has been devoted to discovering new techniques and methods for modeling protein-ligand interactions. In particular, considerable effort has focused on identifying candidate binding sites, which are quite often active sites that correspond to protein pockets or cavities. These cavities therefore play an important role in molecular docking. However, there is no established benchmark for assessing the accuracy of new cavity detection methods. In practice, each new technique is evaluated using a small set of proteins with known binding sites as ground truth. Studies supported by large datasets of known cavities and/or binding sites and by statistical classification (i.e., counts of false positives, false negatives, true positives, and true negatives) would yield much stronger and more reliable assessments. To this end, we propose CavBench, a generic and extensible benchmark for comparing different cavity detection methods against diverse ground-truth datasets (e.g., PDBsum) using statistical classification methods.
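To make the statistical-classification idea concrete, the following is a minimal Python sketch of how predicted cavities might be scored against a ground-truth set. It is not CavBench's actual procedure: the representation of a cavity as a set of atom indices, the Jaccard-overlap matching rule, the 0.5 threshold, and all function names are assumptions introduced here for illustration. True negatives are omitted because they are not naturally defined when the items being counted are whole cavities.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: a cavity is the set of atom indices lining it.
Cavity = FrozenSet[int]


def jaccard(a: Cavity, b: Cavity) -> float:
    """Overlap between two cavities, |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def confusion_counts(
    predicted: List[Cavity],
    ground_truth: List[Cavity],
    threshold: float = 0.5,  # assumed matching threshold, not from the paper
) -> Tuple[int, int, int]:
    """Return (TP, FP, FN) using greedy one-to-one matching by best overlap."""
    matched = set()  # indices of ground-truth cavities already claimed
    tp = 0
    for pred in predicted:
        best_idx, best_score = None, 0.0
        for idx, truth in enumerate(ground_truth):
            if idx in matched:
                continue
            score = jaccard(pred, truth)
            if score > best_score:
                best_idx, best_score = idx, score
        if best_idx is not None and best_score >= threshold:
            matched.add(best_idx)
            tp += 1
    fp = len(predicted) - tp            # predictions with no ground-truth match
    fn = len(ground_truth) - len(matched)  # ground-truth cavities never found
    return tp, fp, fn


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard precision, recall, and F1 computed from the counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy data: two known cavities, two predictions (one good, one spurious).
    truth = [frozenset({1, 2, 3, 4}), frozenset({10, 11, 12})]
    preds = [frozenset({1, 2, 3}), frozenset({20, 21})]
    counts = confusion_counts(preds, truth)
    print(counts, precision_recall_f1(*counts))
```

In this toy case the first prediction overlaps the first known cavity well enough to count as a true positive, the second prediction matches nothing (a false positive), and the second known cavity is missed (a false negative). A different matching rule, such as distance between cavity centers, would change the counts, which is one reason a shared benchmark is useful.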