Mayr Fritz, Wieder Marcus, Wieder Oliver, Langer Thierry
Department of Pharmaceutical Sciences, Pharmaceutical Chemistry Division, University of Vienna, Vienna, Austria.
Front Chem. 2022 May 26;10:866585. doi: 10.3389/fchem.2022.866585. eCollection 2022.
Enumerating protonation states and calculating microstate pKa values of small molecules is an important yet challenging task for lead optimization and molecular modeling. Commercial and non-commercial solutions have notable limitations, such as restrictive and expensive licenses, high CPU/GPU hour requirements, or the need for expert knowledge to set up and use. We present a graph neural network model that is trained on 714,906 calculated microstate pKa predictions for molecules obtained from the ChEMBL database. The model is then fine-tuned on a set of 5,994 experimental pKa values, which significantly improves its performance on two challenging test sets. By combining the graph neural network model with Dimorphite-DL, an open-source program for enumerating ionization states, we have developed the open-source Python package pkasolver, which can generate and enumerate protonation states and calculate pKa values with high accuracy.
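To illustrate why a predicted pKa value matters for choosing a protonation state, the following minimal sketch applies the Henderson-Hasselbalch relation to a single acidic site. It is not part of pkasolver or Dimorphite-DL; the function names `fraction_deprotonated` and `fraction_protonated` and the example pKa of 4.76 are assumptions chosen for illustration only.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of a single acidic site in its deprotonated form.

    Uses the Henderson-Hasselbalch relation for one site HA <-> A- + H+:
        [A-] / ([HA] + [A-]) = 1 / (1 + 10 ** (pKa - pH))
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of the same site that remains protonated."""
    return 1.0 - fraction_deprotonated(pka, ph)


if __name__ == "__main__":
    # Hypothetical example: a site with pKa 4.76 evaluated at several pH values.
    example_pka = 4.76
    for ph in (2.0, 4.76, 7.4):
        f = fraction_deprotonated(example_pka, ph)
        print(f"pH {ph:5.2f}: deprotonated fraction = {f:.4f}")
```

When the pH equals the pKa, both forms are equally populated; well above the pKa, the deprotonated form dominates. This single-site picture ignores coupling between multiple titratable sites, which is what microstate pKa values are meant to capture.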