Vijayaraghavan Sudharsan, Lakshminarayanan Akshaya, Bhargava Naman, Ravichandran Janani, Vivek-Ananth R P, Samal Areejit
The Institute of Mathematical Sciences (IMSc), Chennai 600113, India.
Department of Applied Mathematics and Computational Sciences, PSG College of Technology, Coimbatore 641004, India.
ACS Omega. 2024 Mar 6;9(11):13006-13016. doi: 10.1021/acsomega.3c09392. eCollection 2024 Mar 19.
Breast milk serves as a vital source of essential nutrients for infants. However, contamination of human milk through the transfer of environmental chemicals from the maternal exposome is a significant concern for infant health. The milk-to-plasma concentration (M/P) ratio is a critical metric that quantifies the extent to which these chemicals transfer from maternal plasma into breast milk, and thus affects infant exposure. Machine learning-based predictive toxicology models can be valuable in predicting chemicals with a high propensity to transfer into human milk. To this end, we build classification- and regression-based models by employing multiple machine learning algorithms and leveraging the largest curated data set to date, comprising 375 chemicals with known M/P ratios. Our support vector machine (SVM)-based classifier outperforms the other models across different performance metrics when evaluated on both the (internal) test data and an external test data set. Specifically, on the (internal) test data the SVM-based classifier achieved a classification accuracy of 77.33%, a specificity of 84%, a sensitivity of 64%, and an F1-score of 65.31%. When evaluated on an external test data set, our SVM-based classifier is found to be generalizable, with a sensitivity of 77.78%. While we were able to build highly predictive classification models, our best regression models for predicting the M/P ratio of chemicals could achieve only moderate R² values on the (internal) test data. As noted in the earlier literature, our study also highlights the challenges in developing accurate regression models for predicting the M/P ratio of xenobiotic chemicals. Overall, this study attests to the immense potential of predictive computational toxicology models in characterizing the myriad chemicals in the human exposome.
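The four classification metrics quoted for the (internal) test data are all derived from one confusion matrix, so it can help to see how they relate. The sketch below is illustrative only: the abstract does not report the confusion matrix or the test-set size, so the counts used here (16 true positives, 9 false negatives, 8 false positives, 42 true negatives, 75 chemicals in total) are an assumption, chosen because they reproduce the reported percentages. The function name `classification_metrics` is hypothetical, and the positive class is assumed to be the chemicals with a high propensity to transfer into milk.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    All values are returned as fractions in [0, 1].
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,        # fraction of all chemicals classified correctly
        "sensitivity": tp / (tp + fn),        # recall on the positive class
        "specificity": tn / (tn + fp),        # recall on the negative class
        "f1": 2 * tp / (2 * tp + fp + fn),    # harmonic mean of precision and recall
    }


# ASSUMED counts, not taken from the paper: one confusion matrix that is
# consistent with the percentages reported in the abstract.
metrics = classification_metrics(tp=16, fn=9, fp=8, tn=42)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these assumed counts the script prints an accuracy of 77.33%, a sensitivity of 64.00%, a specificity of 84.00%, and an F1-score of 65.31%. The gap between specificity and sensitivity shows that such a classifier misses more high-transfer chemicals than it raises false alarms, which accuracy alone would hide.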