Bachorz Rafał A, Nowak Damian, Ratajewski Marcin
Institute of Medical Biology, Polish Academy of Sciences, Łódź, Poland.
Department of Quantum Chemistry, Faculty of Chemistry, Adam Mickiewicz University, Poznań, Poland.
Front Bioinform. 2024 Sep 23;4:1441024. doi: 10.3389/fbinf.2024.1441024. eCollection 2024.
A variety of methods can support the drug design process. Some focus on molecular property prediction, a key step in early-stage drug discovery. Before experimental validation, drug candidates are usually compared against known experimental data. In practice, this can be done with machine learning, where selected experimental data are used to train predictive models. The Python software proposed here is designed for this purpose. It supports the entire molecular data processing workflow, from raw data preparation through molecular descriptor creation to machine learning model training. The predictive capabilities of the resulting models were validated both internally and externally. The models can be readily applied to new compounds, including within more complex workflows that involve generative approaches.
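The workflow outlined in the abstract (raw data preparation, descriptor creation, model training, internal and external validation) can be illustrated with a minimal, dependency-free Python sketch. It does not reproduce the published software or its API. Every name in it (`prepare`, `descriptors`, `knn_predict`, `synthetic_target`) is hypothetical, the descriptors are simple character counts over SMILES strings rather than real cheminformatics descriptors, the model is a plain k-nearest-neighbour regressor chosen only for brevity, and the target values are synthetic placeholders, not experimental measurements.

```python
import math
import random

# Hypothetical toy descriptor alphabet: characters counted in a SMILES string.
# A real workflow would use a cheminformatics toolkit instead.
ALPHABET = ("C", "N", "O", "=", "(")


def prepare(records):
    """Raw data preparation: strip whitespace, drop empty SMILES,
    missing target values and duplicate structures."""
    seen, clean = set(), []
    for smiles, value in records:
        smiles = (smiles or "").strip()
        if not smiles or value is None or smiles in seen:
            continue
        seen.add(smiles)
        clean.append((smiles, float(value)))
    return clean


def descriptors(smiles):
    """Descriptor creation: one character count per ALPHABET entry."""
    return [float(smiles.count(ch)) for ch in ALPHABET]


def knn_predict(train_X, train_y, x, k=3):
    """Predict as the mean target of the k nearest training points."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def leave_one_out_rmse(X, y, k=3):
    """Internal validation: predict each training point from all the others."""
    preds = [knn_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], k)
             for i in range(len(X))]
    return rmse(y, preds)


def synthetic_target(smiles):
    """Placeholder property computed from atom counts. NOT experimental data."""
    return 0.5 * smiles.count("C") - 1.0 * smiles.count("O") - 0.8 * smiles.count("N")


SMILES = ["C", "CC", "CCC", "CCCC", "CO", "CCO", "CCCO", "CN", "CCN", "CCCN",
          "CC=O", "CC(=O)O", "CC(C)O", "CC(C)C", "C=C", "CC(=O)N"]

# Raw records, deliberately including a duplicate, an empty entry and a missing value.
raw = [(s, synthetic_target(s)) for s in SMILES]
raw += [(" CCO ", synthetic_target("CCO")), ("", 1.0), ("CCCCC", None)]

clean = prepare(raw)

# Hold out a quarter of the cleaned records as an external validation set.
random.Random(0).shuffle(clean)
n_external = len(clean) // 4
external, train = clean[:n_external], clean[n_external:]

train_X = [descriptors(s) for s, _ in train]
train_y = [v for _, v in train]
external_X = [descriptors(s) for s, _ in external]
external_y = [v for _, v in external]

internal_rmse = leave_one_out_rmse(train_X, train_y)
external_rmse = rmse(external_y, [knn_predict(train_X, train_y, x) for x in external_X])

print(f"internal (leave-one-out) RMSE: {internal_rmse:.3f}")
print(f"external (hold-out) RMSE:      {external_rmse:.3f}")

# Applying the trained model to a new compound.
new_prediction = knn_predict(train_X, train_y, descriptors("CCCCO"))
```

Keeping the external set untouched until the final evaluation is what distinguishes it from the internal leave-one-out estimate: the first measures how the model generalises to compounds it has never seen, while the second only reuses the training data.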