Qin Yongfei, Li Chao, Shi Xia, Wang Weigang
School of Statistics and Mathematics, Zhejiang Gongshang University, Hangzhou, China.
Collaborative Innovation Center of Statistical Data Engineering, Technology and Application, Zhejiang Gongshang University, Hangzhou, China.
Front Bioeng Biotechnol. 2022 Jul 13;10:946329. doi: 10.3389/fbioe.2022.946329. eCollection 2022.
The development of breast cancer is closely linked to the estrogen receptor ERα, which is also considered an important target for breast cancer treatment. Compounds that antagonize ERα activity may therefore be drug candidates for treating breast cancer. In drug development, to save manpower and resources, potentially active compounds are often screened by building a compound activity prediction model. For the 1,974 compounds collected, the 20 molecular descriptors with the greatest effect on biological activity were selected using a LASSO regression model combined with 10-fold cross-validation. A regression prediction model based on a multilayer perceptron (MLP), a fully connected neural network, was then constructed to predict the bioactivity values of 50 new compounds. To measure the validity of the model, the mean squared error (MSE) was used as the loss function. The MLP-based regression prediction model had a loss value of 0.0146 on the validation set, suggesting that the model is well trained and that the prediction strategy is valid. The methods developed in this paper may provide a reference for the development of anti-breast cancer drugs.
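The two-stage strategy described in the abstract (LASSO with 10-fold cross-validation to pick 20 descriptors, then an MLP regressor evaluated by MSE on a validation set) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the synthetic data, the helper names `make_synthetic_data`, `select_top_descriptors` and `fit_mlp`, the hidden-layer sizes, and the 80/20 train/validation split are all assumptions made for the example, and ranking descriptors by absolute LASSO coefficient is one plausible reading of "top 20".

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler


def make_synthetic_data(n_samples=300, n_descriptors=60, seed=0):
    """Hypothetical stand-in for a descriptor matrix and bioactivity values."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_descriptors))
    # Only the first five descriptors carry signal in this toy data set.
    true_coef = np.array([3.0, -2.0, 1.5, 1.0, -1.0])
    y = X[:, :5] @ true_coef + 0.1 * rng.normal(size=n_samples)
    return X, y


def select_top_descriptors(X, y, k=20, seed=0):
    """Return indices of the k descriptors with the largest |LASSO coefficient|.

    The regularization strength is chosen by 10-fold cross-validation.
    If LASSO keeps fewer than k non-zero coefficients, the remaining
    slots are filled with zero-coefficient descriptors.
    """
    X_std = StandardScaler().fit_transform(X)
    lasso = LassoCV(cv=10, random_state=seed, max_iter=10000).fit(X_std, y)
    order = np.argsort(-np.abs(lasso.coef_))
    return order[:k]


def fit_mlp(X, y, seed=0):
    """Fit an MLP regressor and report the MSE on a held-out validation set."""
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    scaler = StandardScaler().fit(X_tr)
    # Layer sizes are an assumption; the abstract does not state them.
    mlp = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000,
                       random_state=seed)
    mlp.fit(scaler.transform(X_tr), y_tr)
    val_mse = mean_squared_error(y_val, mlp.predict(scaler.transform(X_val)))
    return mlp, scaler, val_mse


if __name__ == "__main__":
    X, y = make_synthetic_data()
    idx = select_top_descriptors(X, y, k=20)
    mlp, scaler, val_mse = fit_mlp(X[:, idx], y)
    print("selected descriptor indices:", sorted(idx.tolist()))
    print("validation MSE:", val_mse)

    # Predict bioactivity for 50 "new" compounds (synthetic here as well).
    X_new, _ = make_synthetic_data(n_samples=50, seed=1)
    predictions = mlp.predict(scaler.transform(X_new[:, idx]))
    print("predictions shape:", predictions.shape)
```

The validation MSE printed here depends on the scale of the target values, so it is not comparable to the 0.0146 reported in the abstract.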