TUM School of Natural Sciences, Department of Bioscience, Center for Functional Protein Assemblies (CPA), Technical University of Munich, 85748 Garching bei München, Germany.
J Chem Inf Model. 2024 Jun 24;64(12):4640-4650. doi: 10.1021/acs.jcim.4c00765. Epub 2024 Jun 5.
The precise prediction of molecular properties can greatly accelerate the development of new drugs. However, molecular property prediction approaches have so far been limited to assays for which large amounts of data are available. In this study, we develop a new computational approach that leverages both the textual description of the assay of interest and the chemical structure of the target compounds. By combining these two sources of information via self-supervised learning, our tool can provide accurate predictions for assays where no measurements are available. Remarkably, our approach achieves state-of-the-art performance on the FS-Mol benchmark for zero-shot prediction, outperforming a wide variety of deep learning approaches. Additionally, we demonstrate how our tool can be used to tailor screening libraries to the assay of interest, showing promising performance in a retrospective case study on a high-throughput screening campaign. By accelerating the early identification of active molecules in drug discovery and development, this method has the potential to streamline the discovery of novel therapeutics.
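To make the zero-shot idea concrete, the following is a minimal sketch of how a screening library might be ranked once an assay description and candidate molecules have been embedded in a shared vector space. It is not the method of the paper: the trained text and structure encoders are assumed to exist and are not shown, the embedding values are invented toy numbers, and the names `cosine`, `rank_library`, `assay_vec`, and `mol_vecs` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_library(assay_vec, mol_vecs, top_k):
    """Return the top_k (name, score) pairs, highest similarity to the assay first.

    assay_vec: embedding of the assay's textual description (assumed to come
               from a trained text encoder, not shown here).
    mol_vecs:  dict mapping a molecule name to its embedding (assumed to come
               from a trained structure encoder in the same vector space).
    """
    scored = [(cosine(assay_vec, vec), name) for name, vec in mol_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(name, score) for score, name in scored[:top_k]]


# Toy, hand-written embeddings used purely for illustration.
assay_vec = [1.0, 0.0, 0.5]
mol_vecs = {
    "mol_a": [0.9, 0.1, 0.4],
    "mol_b": [-1.0, 0.2, 0.0],
    "mol_c": [0.2, 1.0, 0.1],
}

top = rank_library(assay_vec, mol_vecs, top_k=2)
for name, score in top:
    print(f"{name}: {score:.3f}")
```

In this sketch no activity measurements for the assay are used at scoring time; only the assay embedding and the molecule embeddings enter the ranking, which is what allows a library to be tailored to an assay with no prior data. How well such a ranking works depends entirely on how the two encoders were trained.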