Developing QSAR Models with Defined Applicability Domains on PPARγ Binding Affinity Using Large Data Sets and Machine Learning Algorithms.

Author information

Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China.

National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas 72079, United States.

Publication information

Environ Sci Technol. 2021 May 18;55(10):6857-6866. doi: 10.1021/acs.est.0c07040. Epub 2021 Apr 29.

Abstract

Chemicals may cause adverse effects on human health through binding to peroxisome proliferator-activated receptor γ (PPARγ). Hence, PPARγ binding affinity is useful for evaluating chemicals with potential endocrine-disrupting effects. Quantitative structure-activity relationship (QSAR) regression models with defined applicability domains (ADs) are important to enable efficient screening of chemicals with PPARγ binding activity. However, the lack of large data sets has hindered the development of such models. In this study, based on PPARγ binding affinity data sets curated from various sources, 30 QSAR models were developed using molecular fingerprints, two-dimensional descriptors, and five machine learning algorithms. Structure-activity landscapes (SALs) of the training compounds were described by network-like similarity graphs (NSGs). Based on the NSGs, local discontinuity scores were calculated and found to be positively correlated with the cross-validation absolute prediction errors of the models built with the different training sets, descriptors, and algorithms. Moreover, innovative ADs were defined based on pairwise similarities between compounds and were found to outperform some conventional ADs. The curated data sets and developed regression models could be useful for evaluating PPARγ-involved adverse effects of chemicals. The SAL analysis and the innovative ADs could facilitate understanding of prediction results from QSAR models.
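The abstract does not give the exact formulas for the local discontinuity score or the similarity-based AD, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes binary fingerprints stored as sets of on-bit indices, Tanimoto similarity as the pairwise measure, a discontinuity score defined as the mean similarity-weighted activity difference to close neighbors, and an AD rule that requires a minimum number of sufficiently similar training compounds. The function names, the cutoffs, and the toy data are all hypothetical.

```python
from typing import FrozenSet, Sequence

# Assumption: a fingerprint is the set of indices of its "on" bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def local_discontinuity(i: int, fps: Sequence[Fingerprint],
                        activities: Sequence[float],
                        sim_cutoff: float) -> float:
    """Illustrative score: mean of similarity * |activity difference|
    over training compounds at least sim_cutoff similar to compound i."""
    terms = []
    for j, fp in enumerate(fps):
        if j == i:
            continue
        sim = tanimoto(fps[i], fp)
        if sim >= sim_cutoff:
            terms.append(sim * abs(activities[i] - activities[j]))
    return sum(terms) / len(terms) if terms else 0.0


def in_domain(query: Fingerprint, train_fps: Sequence[Fingerprint],
              sim_cutoff: float, min_neighbors: int) -> bool:
    """Illustrative AD rule: the query is inside the domain if at least
    min_neighbors training compounds reach sim_cutoff similarity to it."""
    close = sum(1 for fp in train_fps if tanimoto(query, fp) >= sim_cutoff)
    return close >= min_neighbors


# Toy data (hypothetical): three training compounds with made-up activities.
train_fps = [frozenset({1, 2, 3, 4}), frozenset({1, 2, 3, 5}),
             frozenset({10, 11, 12})]
activities = [6.0, 8.0, 5.0]

score_0 = local_discontinuity(0, train_fps, activities, sim_cutoff=0.5)
inside = in_domain(frozenset({1, 2, 3, 4, 5}), train_fps, 0.5, 2)
outside = in_domain(frozenset({20, 21}), train_fps, 0.5, 2)
```

In this toy setup, a high score for a training compound means that structurally similar neighbors have quite different activities, which is the kind of landscape roughness the abstract links to larger cross-validation errors. In practice the fingerprints would come from a cheminformatics toolkit, and both cutoffs would need to be tuned on the training data.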
