Department of Physics and Astronomy, University of Kansas, Lawrence, Kansas 66045.
Department of Pharmaceutical Chemistry, University of Kansas, Lawrence, Kansas 66045.
J Pharm Sci. 2017 Nov;106(11):3270-3279. doi: 10.1016/j.xphs.2017.07.013. Epub 2017 Jul 22.
There is growing interest in generating physicochemical and biological analytical data sets to compare complex mixture drugs, for example, products from different manufacturers. In this work, we compare various crofelemer samples prepared from a single lot by filtration with varying molecular weight cutoffs combined with incubation for different times at different temperatures. The 2 preceding articles describe experimental data sets generated from analytical characterization of fractionated and degraded crofelemer samples. In this work, we use data mining techniques such as principal component analysis and mutual information scores to help visualize the data and determine discriminatory regions within these large data sets. The mutual information score identifies chemical signatures that differentiate crofelemer samples. These signatures, in many cases, would likely be missed by traditional data analysis tools. We also found that supervised learning classifiers robustly discriminate samples with around 99% classification accuracy, indicating that mathematical models of these physicochemical data sets are capable of identifying even subtle differences in crofelemer samples. Data mining and machine learning techniques can thus identify fingerprint-type attributes of complex mixture drugs that may be used for comparative characterization of products.
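As a rough illustration of how a mutual information score can flag discriminatory regions in a data set, the sketch below scores each feature of a small synthetic data set against its class labels. It is not the authors' method. The equal-width binning, the bin count, the function name `mutual_information`, and the synthetic "channels" are all assumptions made for this example, since the abstract does not specify how the score was estimated.

```python
import math
import random
from collections import Counter


def mutual_information(values, labels, n_bins=4):
    """Estimate I(X; Y) in bits between one continuous feature and class labels.

    Assumption: the feature is discretized with equal-width bins; the source
    does not say which estimator was actually used.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]

    n = len(values)
    joint = Counter(zip(bins, labels))  # counts of (bin, label) pairs
    p_bin = Counter(bins)               # marginal counts per bin
    p_lab = Counter(labels)             # marginal counts per label

    mi = 0.0
    for (b, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y) simplifies to count * n / (count_x * count_y)
        mi += p_xy * math.log2(count * n / (p_bin[b] * p_lab[y]))
    return mi


# Hypothetical data: two sample groups, five measurement channels.
# Only channel 2 differs between the groups; the rest are pure noise.
random.seed(0)
n_per_class, n_channels, informative = 100, 5, 2
labels, rows = [], []
for label, shift in (("A", 0.0), ("B", 5.0)):
    for _ in range(n_per_class):
        row = [random.gauss(0.0, 1.0) for _ in range(n_channels)]
        row[informative] += shift
        rows.append(row)
        labels.append(label)

# Score every channel against the class labels.
scores = [
    mutual_information([row[j] for row in rows], labels)
    for j in range(n_channels)
]
best_channel = max(range(n_channels), key=lambda j: scores[j])

for j, s in enumerate(scores):
    print(f"channel {j}: MI = {s:.3f} bits")
print("most discriminatory channel:", best_channel)
```

Ranking channels by score in this way points to the region that separates the groups. On real spectra the choice of binning and the number of samples would affect the estimate and should be checked.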