Wang Chia-Chi, Lin Pinpin, Chou Che-Yu, Wang Shan-Shan, Tung Chun-Wei
Department and Graduate Institute of Veterinary Medicine, School of Veterinary Medicine, National Taiwan University, Taipei, Taiwan.
National Institute of Environmental Health Sciences, National Health Research Institutes, Miaoli County, Taiwan.
PeerJ. 2020 Jul 21;8:e9562. doi: 10.7717/peerj.9562. eCollection 2020.
Measuring the human fetal-maternal blood concentration ratio (logFM) of chemicals is critical for assessing the risk of chemical-induced developmental toxicity. Although a few in vitro and ex vivo experimental methods have been developed for predicting the logFM of chemicals, their results cannot directly predict in vivo outcomes.
A total of 55 chemicals with logFM values representing the in vivo fetal-maternal blood ratio were divided into training and test datasets. An interpretable linear regression model was developed in combination with feature selection methods. The model was validated by cross-validation on the training dataset and by prediction on the independent test dataset.
This study presents the first valid quantitative structure-activity relationship model, based on multiple linear regression and following the Organisation for Economic Co-operation and Development (OECD) guidelines, for predicting in vivo logFM values. The autocorrelation descriptor AATSC1c and the information content descriptor ZMIC1 were identified as informative features for predicting logFM. After adjustment of the applicability domain, the developed model performs well, with correlation coefficients of 0.875, 0.850 and 0.847 for model fitting, leave-one-out cross-validation and the independent test, respectively. The model is expected to be useful for assessing human transplacental exposure.
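The validation scheme described in the abstract (model fitting, leave-one-out cross-validation, and an independent test) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the data are synthetic rather than the study's 55 chemicals, the two feature columns merely stand in for descriptors such as AATSC1c and ZMIC1, the 40/15 split is arbitrary, and the function names (`fit_mlr`, `loo_predictions`, and so on) are hypothetical. It is not the authors' implementation and does not include feature selection or the applicability domain adjustment.

```python
import random
from statistics import mean


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ...]."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)  # normal equations


def predict(coef, X):
    """Apply the fitted coefficients to each row of X."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def loo_predictions(X, y):
    """Leave-one-out: refit without sample i, then predict sample i."""
    preds = []
    for i in range(len(y)):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(coef, [X[i]])[0])
    return preds


# Synthetic stand-in data (NOT the study's chemicals or descriptor values).
rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(0, 5)] for _ in range(55)]
y = [0.2 + 0.8 * a - 0.1 * b + rng.gauss(0, 0.1) for a, b in X]

# Arbitrary 40/15 split, assumed only for this illustration.
X_train, y_train = X[:40], y[:40]
X_test, y_test = X[40:], y[40:]

coef = fit_mlr(X_train, y_train)
r_fit = pearson_r(y_train, predict(coef, X_train))
r_loo = pearson_r(y_train, loo_predictions(X_train, y_train))
r_test = pearson_r(y_test, predict(coef, X_test))
print(f"r (fit) = {r_fit:.3f}, r (LOO-CV) = {r_loo:.3f}, r (test) = {r_test:.3f}")
```

The three printed values correspond in kind, not in magnitude, to the fitting, leave-one-out and independent-test correlation coefficients reported above. Refitting the model for every held-out chemical is what keeps the leave-one-out estimate independent of the sample being predicted.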