Vigna V, Cova T F G G, Pais A A C C, Sicilia E
PROMOCS Laboratory, Department of Chemistry and Chemical Technologies, University of Calabria, Arcavacata di Rende (CS), Italy.
Coimbra Chemistry Centre, Department of Chemistry, Institute of Molecular Sciences (IMS), Faculty of Sciences and Technology, University of Coimbra, Coimbra, Portugal.
J Cheminform. 2025 Jan 5;17(1):1. doi: 10.1186/s13321-024-00939-5.
Effective light-based cancer treatments, such as photodynamic therapy (PDT) and photoactivated chemotherapy (PACT), rely on compounds that are efficiently activated by light and absorb within the therapeutic window (600-850 nm). Traditional methods for predicting these light absorption properties, including Time-Dependent Density Functional Theory (TDDFT), are often computationally intensive and time-consuming. In this study, we explore a machine learning (ML) approach to predict light absorption within the therapeutic window for platinum, iridium, ruthenium, and rhodium complexes, with the aim of streamlining the screening of potential photoactivatable prodrugs. Using a dataset of 9775 complexes compiled from the Reaxys database, we trained six classification models, including random forests, support vector machines, and neural networks, with various molecular descriptors. Our findings indicate that the Extreme Gradient Boosting Classifier (XGBC) paired with AtomPairs2D descriptors delivers the highest predictive accuracy and robustness. This ML-based method significantly accelerates the identification of suitable compounds, providing a valuable tool for the early-stage design and development of phototherapy drugs. The method also allows relevant structural characteristics of a base molecule to be modified using information from the supervised approach.
Scientific Contribution: The proposed machine learning (ML) approach predicts the ability of transition metal-based complexes to absorb light in the UV-vis therapeutic window, a key trait for phototherapeutic agents. While ML models have been used to predict UV-vis properties of organic molecules, applying them to metal complexes is novel. The model is efficient, fast, and resource-light, using decision tree-based algorithms that provide interpretable results. This interpretability helps to understand the classification rules and facilitates targeted structural modifications to convert inactive complexes into potentially active ones.
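The classification target described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a complex is labelled active (1) when at least one reported absorption maximum falls inside the 600-850 nm window, with inclusive bounds; the abstract does not state the exact labelling rule. The function name, the complex names, and the wavelength values are all hypothetical.

```python
from typing import Iterable, Tuple

# Therapeutic window bounds in nm, as given in the abstract.
WINDOW_NM: Tuple[float, float] = (600.0, 850.0)


def absorbs_in_window(maxima_nm: Iterable[float],
                      window: Tuple[float, float] = WINDOW_NM) -> bool:
    """Return True if any absorption maximum lies inside the window.

    Inclusive bounds are an assumption; the source does not specify them.
    """
    low, high = window
    return any(low <= wavelength <= high for wavelength in maxima_nm)


# Hypothetical absorption maxima (nm) for three made-up complexes.
complexes = {
    "complex_A": [350.0, 480.0],
    "complex_B": [420.0, 640.0],
    "complex_C": [860.0],
}

# Binary labels of the kind a classifier could be trained on.
labels = {name: int(absorbs_in_window(maxima))
          for name, maxima in complexes.items()}
print(labels)
```

Labels of this kind, paired with a descriptor vector for each complex, would form the input to a supervised classifier such as those compared in the study.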