A training set selection strategy for a universal near-infrared quantitative model.

Affiliation

Institute of Medicinal Biotechnology, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, People's Republic of China.

Publication information

AAPS PharmSciTech. 2011 Jun;12(2):738-45. doi: 10.1208/s12249-011-9638-6. Epub 2011 Jun 4.

Abstract

This article proposes an empirical answer to the question of how many clusters of complex samples should be selected to construct the training set for a universal near-infrared quantitative model based on the Naes method. The sample spectra were hierarchically classified into clusters using Ward's algorithm and Euclidean distance. With the sample spectra classified into two clusters, 1/50 of the largest heterogeneity value in the cluster with larger variation was set as the threshold that determines the total number of clusters. One sample was then randomly selected from each cluster to construct the training set, so the number of samples in the training set equaled the number of clusters. In this study, 98 batches of rifampicin capsules with API contents ranging from 50.1% to 99.4% were studied with this strategy. For the rifampicin capsule model, the root mean square errors of cross-validation and prediction were 2.54% and 2.31%, respectively. The model was then evaluated in terms of outlier diagnostics, accuracy, precision, and robustness. The same training set selection strategy was also used to revalidate the models for cefradine capsules, roxithromycin tablets, and erythromycin ethylsuccinate tablets, with satisfactory results. In conclusion, all results showed that this training set selection strategy assists in the quick and accurate construction of quantitative models using near-infrared spectroscopy.
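
The selection procedure described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not define the heterogeneity value precisely, so the code assumes it corresponds to the Ward merge height reported by SciPy's `linkage`. The function name `select_training_set` and the parameters `divisor` and `seed` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def select_training_set(spectra, divisor=50.0, seed=0):
    """Pick one random sample per cluster after Ward clustering.

    ASSUMPTION: "heterogeneity" is taken to be the Ward merge height
    in the SciPy linkage matrix; the abstract does not specify this.
    """
    n = spectra.shape[0]
    # Ward's method on Euclidean distances between spectra (rows).
    Z = linkage(spectra, method="ward")

    # The last row of Z merges the two top-level clusters.
    left, right = int(Z[-1, 0]), int(Z[-1, 1])

    def height(node):
        # A leaf (single sample) has no internal merge, so height 0.
        return 0.0 if node < n else float(Z[node - n, 2])

    # Ward merge heights are monotonic, so a sub-tree's own merge
    # height is the largest value inside it. The cluster with the
    # larger height is treated as the one with larger variation.
    threshold = max(height(left), height(right)) / divisor

    # Cut the dendrogram at the threshold to get the final clusters.
    labels = fcluster(Z, t=threshold, criterion="distance")

    # Randomly draw one sample index from each cluster.
    rng = np.random.default_rng(seed)
    chosen = [int(rng.choice(np.flatnonzero(labels == c)))
              for c in np.unique(labels)]
    return np.array(sorted(chosen)), labels
```

Calling `select_training_set(X)` on an array `X` with one spectrum per row returns the indices of the selected training samples together with the cluster label of every sample, so the training set size always equals the number of clusters, as stated in the abstract.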
