Institute of Pharmaceutical and Medicinal Chemistry, Westfälische Wilhelms-Universität Münster, Corrensstraße 48, Münster 48149, Germany.
Center for Multiscale Theory and Computation, Westfälische Wilhelms-Universität Münster, Corrensstraße 48, Münster 48149, Germany.
J Chem Inf Model. 2021 Feb 22;61(2):664-675. doi: 10.1021/acs.jcim.0c01208. Epub 2021 Jan 26.
Similarity-based virtual screening is a fundamental tool in the early drug discovery process and relies heavily on molecular fingerprints. We propose a novel strategy for generating domain-specific fingerprints by training neural networks on target-specific bioactivity datasets and using the network's activations as a new molecular representation. The neural network is expected to combine information about known bioactive compounds with information unique to the molecular structure, thereby enriching the fingerprint. We evaluate this strategy on a large kinase-specific bioactivity dataset. A comparison of five neural network architectures and their fingerprints with the well-established extended-connectivity fingerprint (ECFP) and an autoencoder shows that our neural fingerprint produces better results in the similarity search. Most importantly, the neural fingerprint performs well even when specific targets are not included during training. Surprisingly, although graph neural networks (GNNs) are thought to offer an advantageous alternative, the best-performing neural fingerprints were based on traditional fully connected layers using ECFP4 as input. The neural fingerprint is freely available at: https://github.com/kochgroup/kinase_nnfp.
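As a rough illustration of the idea described above, the following pure-Python sketch derives a fingerprint from the activations of a single fully connected layer fed with a binary input vector, then ranks a small library by similarity to a query. Everything in it is an assumption made for illustration: the 8-bit vectors stand in for real ECFP4 fingerprints, the weights are random rather than trained on kinase bioactivity data, and the layer size, ReLU activation, cosine similarity, and all function names are hypothetical choices that are not taken from the published method or the linked repository.

```python
import math
import random


def tanimoto(a, b):
    """Tanimoto coefficient between two binary fingerprints (lists of 0/1)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def cosine(u, v):
    """Cosine similarity between two real-valued vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def hidden_activation(bits, weights, biases):
    """ReLU output of one fully connected layer, used here as the fingerprint."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, bits)) + b)
        for row, b in zip(weights, biases)
    ]


def rank_by_similarity(query, library, similarity):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, similarity(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy 8-bit vectors standing in for ECFP4 fingerprints (assumption).
toy_ecfp = {
    "mol_a": [1, 1, 0, 1, 0, 0, 1, 0],
    "mol_b": [1, 1, 0, 1, 0, 0, 0, 0],
    "mol_c": [0, 0, 1, 0, 1, 1, 0, 1],
}
query_bits = [1, 1, 0, 1, 0, 0, 1, 0]  # identical to mol_a

# Placeholder weights: random here, whereas a real model would be trained
# on bioactivity labels before its activations are used (assumption).
N_BITS, N_HIDDEN = 8, 4
rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(N_BITS)] for _ in range(N_HIDDEN)]
biases = [0.0] * N_HIDDEN

# Baseline: Tanimoto similarity on the raw bit vectors.
baseline_ranking = rank_by_similarity(query_bits, toy_ecfp, tanimoto)

# Neural variant: compare hidden-layer activations instead of raw bits.
neural_library = {
    name: hidden_activation(bits, weights, biases)
    for name, bits in toy_ecfp.items()
}
neural_query = hidden_activation(query_bits, weights, biases)
neural_ranking = rank_by_similarity(neural_query, neural_library, cosine)

print("baseline:", baseline_ranking)
print("neural:  ", neural_ranking)
```

In a realistic setting, the input bits would come from a cheminformatics toolkit and the weights from a network trained on bioactivity data; with random weights, the neural ranking above carries no chemical meaning and only demonstrates the data flow.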