Megavariate analysis of environmental QSAR data. Part I--a basic framework founded on principal component analysis (PCA), partial least squares (PLS), and statistical molecular design (SMD).

Author information

Eriksson Lennart, Andersson Patrik L, Johansson Erik, Tysklind Mats

Affiliation

Umetrics AB, POB 7960, S-907 19, Umeå, Sweden.

Publication information

Mol Divers. 2006 May;10(2):169-86. doi: 10.1007/s11030-006-9024-6. Epub 2006 Jun 13.

Abstract

This paper introduces principal component analysis (PCA), partial least squares projections to latent structures (PLS), and statistical molecular design (SMD) as useful tools in deriving multi- and megavariate quantitative structure-activity relationship (QSAR) models. Two QSAR data sets from the fields of environmental toxicology and environmental chemistry are worked out in detail, showing the benefits of PCA, PLS and SMD. PCA is useful when overviewing a data set and exploring relationships among compounds and relationships among variables. PLS is the regression extension of PCA and is used for establishing QSARs. SMD is essential for selecting informative training and test sets of compounds for QSAR calibration and validation.
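The workflow outlined in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' procedure: the data are synthetic, the function names (`autoscale`, `pca_scores`, `select_training_set`, `pls1_fit`) are hypothetical, and the training-set selection is a crude stand-in for a formal statistical molecular design (it simply picks the extreme and most central compounds in the PCA score space). PLS is implemented as single-response NIPALS.

```python
import numpy as np

def autoscale(X):
    """Mean-center each descriptor column and scale it to unit variance."""
    mu = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    return (X - mu) / sd, mu, sd

def pca_scores(Xs, n_components):
    """PCA via SVD of the scaled matrix; returns scores T and loadings P."""
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]
    P = Vt[:n_components].T
    return T, P

def select_training_set(T):
    """Assumed, simplified design: per score vector take the compounds with
    the lowest and highest score, then add the compound nearest the origin."""
    chosen = []
    for a in range(T.shape[1]):
        for idx in (int(np.argmin(T[:, a])), int(np.argmax(T[:, a]))):
            if idx not in chosen:
                chosen.append(idx)
    center = int(np.argmin(np.linalg.norm(T, axis=1)))
    if center not in chosen:
        chosen.append(center)
    return sorted(chosen)

def pls1_fit(X, y, n_components):
    """Single-response PLS (NIPALS) on centered X and y.
    Returns regression coefficients b such that y is approximated by X @ b."""
    X, y = X.copy(), y.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)      # weight vector
        t = X @ w                   # score vector
        tt = t @ t
        p = X.T @ t / tt            # X loading
        c = y @ t / tt              # y loading
        X -= np.outer(t, p)         # deflate X
        y -= c * t                  # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

# Synthetic stand-in data: 30 "compounds", 6 correlated "descriptors".
rng = np.random.default_rng(0)
latent = rng.normal(size=(30, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(30, 6))
y = latent @ np.array([1.0, -0.5]) + 0.1 * rng.normal(size=30)

# 1) PCA overview of the descriptor space.
Xs, _, _ = autoscale(X)
T, P = pca_scores(Xs, 2)

# 2) Design-like selection of training compounds; the rest form the test set.
train_idx = select_training_set(T)
test_idx = [i for i in range(len(y)) if i not in train_idx]

# 3) PLS model calibrated on the training set, checked on the test set.
x_mean = Xs[train_idx].mean(axis=0)
y_mean = y[train_idx].mean()
b = pls1_fit(Xs[train_idx] - x_mean, y[train_idx] - y_mean, 2)
y_pred = (Xs[test_idx] - x_mean) @ b + y_mean
rmse = float(np.sqrt(np.mean((y_pred - y[test_idx]) ** 2)))
print("training compounds:", train_idx)
print("test-set RMSE:", rmse)
```

Selecting compounds that span the score space, rather than sampling at random, is what lets a small training set cover the structural variation the model will later be asked to predict.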

