Zenkevich I G, Babushok V I, Linstrom P J, White E V, Stein S E
National Institute of Standards and Technology, Gaithersburg, MD 20899-8320, USA.
J Chromatogr A. 2009 Sep 18;1216(38):6651-61. doi: 10.1016/j.chroma.2009.07.065. Epub 2009 Aug 3.
The effective use of gas chromatographic retention data presented in the form of retention indices (RI) requires the development of a comprehensive structure-based digital archive of retention parameters. Developing such an archive involves collecting all available RI values for a variety of compounds, including replicates measured under slightly different conditions. Review of retention data often shows that the range of RI values for certain well-studied compounds is wider than expected from the simple reproducibility of experimental measurements. Finding and examining unusual RI data distributions offers a possible way to detect and correct errors during the development of comprehensive RI libraries. Our approach involves constructing histograms that represent the distribution of data points across RI intervals. The observed shape of the distribution is compared with the expected and observed shapes for well-identified compounds. Significant systematic deviations represent anomalies in the sets of RI data. The occurrence of more than one maximum on a histogram generally indicates the presence of erroneous data, although for some compounds such multimodal RI distributions may be caused by differences in the experimental conditions of the RI determination. The construction and interpretation of histograms for compounds with multiple RI measurements is illustrated by several examples. For instance, the RI subset for the diterpene alcohol isophytol was separated from the RI data set for phytol, and four additional subgroups of published RI data for one of the sesquiterpenes, gamma-elemene, were re-identified as alpha-, beta-, and delta-elemene and germacrene B.
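The histogram check described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `ri_histogram` and `find_modes`, the 10-unit bin width, and the `min_count` threshold are all assumptions chosen for illustration, and the RI values are synthetic numbers invented to show a two-maximum distribution, not measured data for any compound.

```python
from collections import Counter


def ri_histogram(values, bin_width=10):
    """Count RI values in fixed-width bins (bin_width is an assumed setting).

    Returns (bin_starts, counts); empty bins between the lowest and
    highest occupied bin are included so that gaps stay visible.
    """
    if not values:
        return [], []
    lo = int(min(values) // bin_width)
    hi = int(max(values) // bin_width)
    per_bin = Counter(int(v // bin_width) for v in values)
    bins = range(lo, hi + 1)
    return [b * bin_width for b in bins], [per_bin.get(b, 0) for b in bins]


def find_modes(counts, min_count=3):
    """Return indices of local maxima holding at least min_count points.

    A run of equal neighbouring bins (a plateau) is reported once, at its
    middle index. min_count is an assumed noise threshold.
    """
    modes = []
    n = len(counts)
    i = 0
    while i < n:
        j = i
        while j + 1 < n and counts[j + 1] == counts[i]:
            j += 1  # extend over a plateau of equal counts
        left = counts[i - 1] if i > 0 else -1
        right = counts[j + 1] if j + 1 < n else -1
        if counts[i] >= min_count and counts[i] > left and counts[i] > right:
            modes.append((i + j) // 2)
        i = j + 1
    return modes


# Synthetic RI values (invented for illustration only).
values = [1431, 1433, 1434, 1436, 1437, 1438, 1441, 1442,
          1491, 1493, 1494, 1496, 1498]

starts, counts = ri_histogram(values, bin_width=10)
modes = find_modes(counts)
print([starts[i] for i in modes])  # [1430, 1490]
if len(modes) > 1:
    print("More than one maximum: review this RI set for misassigned data")
```

A result with more than one maximum only flags the data set for review. As the abstract notes, a second maximum may come from a misidentified compound or from different experimental conditions, so deciding between these still requires examining the underlying records. On sparse or noisy data this simple local-maximum test can also report spurious peaks, in which case wider bins or a higher `min_count` may be needed.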