Laboratory for Molecular Modeling, Division of Medicinal Chemistry and Natural Products, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.
Environ Health Perspect. 2011 Mar;119(3):364-70. doi: 10.1289/ehp.1002476. Epub 2010 Oct 27.
Quantitative high-throughput screening (qHTS) assays are increasingly being used to inform chemical hazard identification. Hundreds of chemicals have been tested in dozens of cell lines across extensive concentration ranges by the National Toxicology Program in collaboration with the National Institutes of Health Chemical Genomics Center.
Our goal was to test the hypothesis that the dose-response data points from qHTS assays can serve as biological descriptors of the assayed chemicals and, when combined with conventional chemical descriptors, improve the accuracy of quantitative structure-activity relationship (QSAR) models used to predict in vivo toxicity end points.
We obtained cell viability qHTS concentration-response data from PubChem for 1,408 substances assayed in 13 cell lines; for a subset of these compounds, rodent acute toxicity median lethal dose (LD50) data were also available. We used the k-nearest neighbor classification and random forest QSAR methods to model the LD50 data using chemical descriptors either alone (conventional models) or combined with biological descriptors derived from the concentration-response qHTS data (hybrid models). Critical to our approach was the use of a novel noise-filtering algorithm to treat the qHTS data.
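The distinction between conventional and hybrid models can be illustrated with a minimal sketch. It assumes that a hybrid descriptor is simply the chemical descriptor vector concatenated with the qHTS concentration-response values, and it uses a plain Euclidean k-nearest neighbor vote. The helper names (`hybrid_descriptor`, `knn_predict`), the toy descriptor values, and the class labels are hypothetical and not taken from the study; the noise-filtering step is not described in the abstract and is not shown.

```python
import math
from collections import Counter


def hybrid_descriptor(chemical, biological):
    """Concatenate chemical and biological descriptor vectors (assumed scheme)."""
    return list(chemical) + list(biological)


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority class among the k training compounds nearest to query."""
    ranked = sorted(range(len(train_X)),
                    key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up toy data: two chemical descriptors per compound, plus cell
# viability fractions at three increasing concentrations.
chem = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
bio = [[1.0, 0.9, 0.8], [0.95, 0.9, 0.85], [0.9, 0.4, 0.1], [0.85, 0.5, 0.2]]
labels = ["nontoxic", "nontoxic", "toxic", "toxic"]

# Conventional model: chemical descriptors only.
conventional_X = chem
# Hybrid model: chemical descriptors plus biological descriptors.
hybrid_X = [hybrid_descriptor(c, b) for c, b in zip(chem, bio)]

query_chem = [0.85, 0.85]
query_bio = [0.9, 0.45, 0.15]

conventional_pred = knn_predict(conventional_X, labels, query_chem, k=3)
hybrid_pred = knn_predict(hybrid_X, labels,
                          hybrid_descriptor(query_chem, query_bio), k=3)
print(conventional_pred, hybrid_pred)
```

In practice, descriptors on different scales would need normalization before distances are computed; the toy values here are already on a comparable 0 to 1 scale.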
Both the external classification accuracy and the coverage (i.e., the fraction of compounds in the external set that fall within the applicability domain) of the hybrid QSAR models were superior to those of the conventional models.
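Coverage and external accuracy, as defined above, can be computed as in the following sketch. The applicability domain rule used here (a compound is inside the domain if its nearest training compound lies within a fixed Euclidean distance) is an assumption made for illustration; the abstract does not specify how the domain was defined. The names `in_domain`, `coverage_and_accuracy`, and `nearest_label`, and all data values, are hypothetical.

```python
import math


def in_domain(train_X, query, threshold):
    """Assumed rule: inside the domain if the nearest training compound is within threshold."""
    return min(math.dist(x, query) for x in train_X) <= threshold


def coverage_and_accuracy(train_X, ext_X, ext_y, predict, threshold):
    """Return (coverage, accuracy), with accuracy computed on covered compounds only."""
    covered = [(x, y) for x, y in zip(ext_X, ext_y)
               if in_domain(train_X, x, threshold)]
    coverage = len(covered) / len(ext_X)
    if not covered:
        return coverage, None  # accuracy is undefined when nothing is covered
    correct = sum(1 for x, y in covered if predict(x) == y)
    return coverage, correct / len(covered)


# Made-up training set and a 1-nearest-neighbor predictor.
train_X = [[0.0, 0.0], [1.0, 1.0]]
train_y = ["nontoxic", "toxic"]


def nearest_label(query):
    """Predict the label of the single closest training compound."""
    best = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    return train_y[best]


# Made-up external set; the third compound lies far from all training data.
ext_X = [[0.1, 0.0], [0.9, 1.0], [5.0, 5.0], [0.0, 0.2]]
ext_y = ["nontoxic", "toxic", "toxic", "toxic"]

coverage, accuracy = coverage_and_accuracy(train_X, ext_X, ext_y,
                                           nearest_label, threshold=0.5)
print(coverage, accuracy)
```

Reporting both numbers together matters because a model can raise its accuracy simply by shrinking its applicability domain and declining to predict difficult compounds.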
Concentration-response qHTS data may serve as informative biological descriptors of molecules that, when combined with conventional chemical descriptors, may considerably improve the accuracy and utility of computational approaches for predicting in vivo animal toxicity end points.