Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China.
State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China.
Environ Sci Technol. 2023 Nov 21;57(46):17762-17773. doi: 10.1021/acs.est.2c04400. Epub 2022 Oct 25.
More than 7000 per- and polyfluorinated alkyl substances (PFAS) have been documented in the U.S. Environmental Protection Agency's CompTox Chemicals database. These PFAS are used in a broad range of industrial and consumer applications but may pose environmental and health risks. However, little is known about the bioaccumulation of emerging PFAS, which limits assessment of their chemical safety. This study draws on a large, high-quality data set of fluorochemicals from related environmental and pharmaceutical chemical databases to develop machine learning (ML) models that classify the unbound fraction of compounds in plasma. A comprehensive evaluation of the ML models shows that the best blending model achieves an accuracy of 0.901 on the test set. The predictions suggest that most PFAS (∼92%) have a high binding fraction in plasma. Introducing basic amino groups is likely to reduce the binding affinity of PFAS for plasma proteins. Molecular dynamics simulations indicate a clear distinction between PFAS with high and low binding fractions. These computational workflows can be used to predict the bioaccumulation of emerging PFAS and can guide the molecular design of PFAS to prevent the release of high-bioaccumulation compounds into the environment.
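To illustrate the kind of classification step the abstract describes, the following is a minimal sketch of combining several base classifiers and scoring the result on a test set. It is not the authors' pipeline: the abstract does not specify the base learners, descriptors, or blending scheme, so this sketch substitutes a simple weighted average of predicted probabilities for a trained blending model. The function names, the 0.5 decision threshold, the label convention (1 = high binding fraction in plasma, 0 = low), and all numbers are hypothetical.

```python
from typing import List, Optional, Sequence


def blend_probabilities(
    model_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Weighted average of per-model probabilities for the 'high binding' class.

    model_probs[m][i] is the probability that hypothetical base model m assigns
    to compound i being in the high-binding class. Averaging is an assumed
    stand-in for a trained blending (meta-learner) model.
    """
    if not model_probs:
        raise ValueError("need at least one base model")
    if weights is None:
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("one weight per base model is required")
    n_samples = len(model_probs[0])
    if any(len(p) != n_samples for p in model_probs):
        raise ValueError("all models must score the same compounds")
    total = sum(weights)
    return [
        sum(w * probs[i] for w, probs in zip(weights, model_probs)) / total
        for i in range(n_samples)
    ]


def classify(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map blended probabilities to class labels (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in probs]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of compounds whose predicted class matches the reference label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equally long")
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


if __name__ == "__main__":
    # Made-up probabilities from three hypothetical base models for five compounds.
    model_a = [0.9, 0.8, 0.3, 0.6, 0.2]
    model_b = [0.7, 0.9, 0.4, 0.4, 0.1]
    model_c = [0.8, 0.6, 0.6, 0.7, 0.3]
    y_true = [1, 1, 1, 1, 0]  # made-up reference labels

    y_pred = classify(blend_probabilities([model_a, model_b, model_c]))
    print("predicted classes:", y_pred)
    print("test accuracy:", accuracy(y_true, y_pred))
```

Accuracy here is the same metric the abstract reports (0.901 for the best blending model), computed as the share of test compounds assigned to the correct class. A blending model in the stricter sense would fit a second-stage learner on held-out base-model predictions rather than using fixed weights.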