Hughes-Oliver Jacqueline M, Brooks Atina D, Welch William J, Khaledi Morteza G, Hawkins Douglas, Young S Stanley, Patil Kirtesh, Howell Gary W, Ng Raymond T, Chu Moody T
Department of Statistics, North Carolina State University, Raleigh, NC, USA.
In Silico Biol. 2011;11(1-2):61-81. doi: 10.3233/CI-2008-0016.
ChemModLab, written by the ECCR @ NCSU consortium under NIH support, is a toolbox for fitting and assessing quantitative structure-activity relationships (QSARs). Its elements are: a cheminformatic front end used to supply molecular descriptors for use in modeling; a set of methods for fitting models; and methods for validating the resulting model. Compounds may be input as structures from which standard descriptors will be calculated using the freely available cheminformatic front end PowerMV; PowerMV also supports compound visualization. In addition, the user can directly input their own choices of descriptors, so the capability for comparing descriptors is effectively unlimited. The statistical methodologies comprise a comprehensive collection of approaches whose validity and utility have been accepted by experts in the fields. As far as possible, these tools are implemented in open-source software linked into the flexible R platform, giving the user the capability of applying many different QSAR modeling methods in a seamless way. As promising new QSAR methodologies emerge from the statistical and data-mining communities, they will be incorporated in the laboratory. The web site also incorporates links to public-domain data sets that can be used as test cases for proposed new modeling methods. The capabilities of ChemModLab are illustrated using a variety of biological responses, with different modeling methodologies being applied to each. These show clear differences in quality of the fitted QSAR model, and in computational requirements. The laboratory is web-based, and use is free. Researchers with new assay data, a new descriptor set, or a new modeling method may readily build QSAR models and benchmark their results against other findings. Users may also examine the diversity of the molecules identified by a QSAR model. 
Moreover, users can either place their data sets in a public area to facilitate communication with other researchers or keep them hidden to preserve confidentiality.