Predicting chemical ecotoxicity by learning latent space chemical representations.

Authors

Gao Feng, Zhang Wei, Baccarelli Andrea A, Shen Yike

Affiliations

Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY 10032, United States.

Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI 48823, United States.

Publication information

Environ Int. 2022 May;163:107224. doi: 10.1016/j.envint.2022.107224. Epub 2022 Apr 1.

DOI:10.1016/j.envint.2022.107224
PMID:35395577
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9044254/

Abstract

In silico prediction of chemical ecotoxicity (HC50) represents an important complement to improve in vivo and in vitro toxicological assessment of manufactured chemicals. Recent application of machine learning models to predict chemical HC50 yields variable prediction performance that depends on effectively learning chemical representations from high-dimension data. To improve HC50 prediction performance, we developed an autoencoder model by learning latent space chemical embeddings. This novel approach achieved state-of-the-art prediction performance of HC50 with R² of 0.668 ± 0.003 and mean absolute error (MAE) of 0.572 ± 0.001, and outperformed other dimension reduction methods including principal component analysis (PCA) (R² = 0.601 ± 0.031 and MAE = 0.629 ± 0.005), kernel PCA (R² = 0.631 ± 0.008 and MAE = 0.625 ± 0.006), and uniform manifold approximation and projection dimensionality reduction (R² = 0.400 ± 0.008 and MAE = 0.801 ± 0.002). A simple linear layer with chemical embeddings learned from the autoencoder model performed better than random forest (R² = 0.663 ± 0.007 and MAE = 0.591 ± 0.008), fully connected neural network (R² = 0.614 ± 0.016 and MAE = 0.610 ± 0.008), least absolute shrinkage and selection operator (R² = 0.617 ± 0.037 and MAE = 0.619 ± 0.007), and ridge regression (R² = 0.638 ± 0.007 and MAE = 0.613 ± 0.005) using unlearned raw input features. Our results highlighted the usefulness of learning latent chemical representations, and our autoencoder model provides an alternative approach for robust HC50 prediction.
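
The abstract compares models by the coefficient of determination and MAE, each reported as a value ± a spread. The sketch below illustrates how such figures can be computed; it is not the authors' code. It assumes the ± term is a standard deviation over repeated runs, which the abstract does not state. The function names and the toy numbers are hypothetical and are not data from the paper.

```python
from statistics import mean, stdev


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])


def summarize(scores):
    """Mean and sample standard deviation across repeated runs (assumed meaning of '±')."""
    return mean(scores), stdev(scores)


# Toy targets and three hypothetical repeated runs of one model (made-up values).
y_true = [1.0, 2.0, 3.0, 4.0]
runs = [
    [1.5, 2.0, 2.5, 4.0],
    [1.0, 2.5, 3.0, 3.5],
    [0.0, 2.0, 3.0, 4.0],
]

r2_runs = [r2_score(y_true, pred) for pred in runs]
mae_runs = [mean_absolute_error(y_true, pred) for pred in runs]

r2_mean, r2_sd = summarize(r2_runs)
mae_mean, mae_sd = summarize(mae_runs)
print(f"R2  = {r2_mean:.3f} ± {r2_sd:.3f}")
print(f"MAE = {mae_mean:.3f} ± {mae_sd:.3f}")
```

A higher coefficient of determination and a lower MAE both indicate better predictions, which is the sense in which the abstract ranks the autoencoder embeddings above PCA, kernel PCA, and UMAP.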


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/977f/9044254/eaa1de960849/nihms-1796897-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/977f/9044254/7bec697d7c3f/nihms-1796897-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/977f/9044254/78b80590120f/nihms-1796897-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/977f/9044254/5ab6ac6d0b55/nihms-1796897-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/977f/9044254/ac061c7370bb/nihms-1796897-f0005.jpg
