Laboratory of Chemometrics, Department of Chemistry, University of Perugia, Via Elce di Sotto 10, I-06123 Perugia, Italy.
J Chem Inf Model. 2012 Sep 24;52(9):2462-70. doi: 10.1021/ci3002809. Epub 2012 Sep 4.
P-Glycoprotein (Pgp) is involved in the elimination and disposition of a significant portion of marketed drugs. So far, publicly available data sets used for modeling Pgp transport have included compounds tested in different assays, different cell lines, and under different protocols. In this work, we present a collection of 478 Efflux Ratios (ERs) measured in MDCK-MDR1 cell lines, and from this collection we define a data set of 187 compounds that were tested in the Borst-derived MDCK-MDR1 cell lines. Of the 23 models resulting from the use of different descriptors, classification algorithms, and variable selection techniques, the 4 most accurate in external validation (accuracy ∼0.86) are based on VolSurf+ (VS+) descriptors. Two of these models are Naïve Bayes (NB) classifiers using 4 descriptors that were selected through a new technique, which is described extensively here for the first time.
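As a rough illustration of the workflow outlined in the abstract, the sketch below computes an efflux ratio as the ratio of basolateral-to-apical to apical-to-basolateral apparent permeability, labels compounds as substrates, and fits a small Gaussian Naïve Bayes classifier on four descriptors. Several things here are assumptions and are not taken from the abstract: the ER cutoff of 2.0, the Gaussian form of the classifier, all function and class names, and the toy permeability and descriptor values. The real models use VolSurf+ descriptors and a descriptor selection technique that is not reproduced here.

```python
import math
from statistics import mean, pvariance


def efflux_ratio(papp_b_to_a, papp_a_to_b):
    """ER = Papp(B->A) / Papp(A->B); both values must share the same units."""
    if papp_a_to_b <= 0:
        raise ValueError("Papp(A->B) must be positive")
    return papp_b_to_a / papp_a_to_b


def is_substrate(er, threshold=2.0):
    # threshold=2.0 is an assumed cutoff for illustration, not from the abstract.
    return er >= threshold


class TinyGaussianNB:
    """Hypothetical minimal Gaussian Naive Bayes; a stand-in, not the paper's model."""

    def fit(self, X, y):
        self.params = {}
        n = len(y)
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            cols = list(zip(*rows))  # one tuple of values per descriptor
            self.params[label] = (
                math.log(len(rows) / n),              # log prior of the class
                [mean(c) for c in cols],              # per-descriptor mean
                [pvariance(c) + 1e-9 for c in cols],  # per-descriptor variance, smoothed
            )
        return self

    def predict_one(self, x):
        def log_posterior(label):
            log_prior, mus, variances = self.params[label]
            log_lik = sum(
                -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                for xi, m, v in zip(x, mus, variances)
            )
            return log_prior + log_lik

        return max(self.params, key=log_posterior)


# Toy (Papp B->A, Papp A->B) pairs; the numbers are invented for illustration.
papp_pairs = [(12.0, 2.0), (8.0, 2.0), (9.0, 3.0), (3.0, 3.0), (2.0, 4.0), (4.5, 3.0)]
y_train = [is_substrate(efflux_ratio(ba, ab)) for ba, ab in papp_pairs]

# Four invented descriptor values per compound, in the same order as papp_pairs.
X_train = [
    [0.9, 0.8, 0.7, 0.9],
    [1.0, 0.9, 0.8, 1.1],
    [1.1, 1.0, 0.9, 1.0],
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.1, 0.3, 0.1],
    [0.0, 0.3, 0.2, 0.2],
]

model = TinyGaussianNB().fit(X_train, y_train)
```

Calling `model.predict_one(descriptors)` on a new four-value descriptor vector returns `True` (predicted substrate) or `False`. A real study would instead evaluate accuracy on a held-out external set, as the abstract describes.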