Paakkunainen Maaret, Kilpeläinen Jarmo, Reinikainen Satu-Pia, Minkkinen Pentti
Lappeenranta University of Technology, Department of Chemical Technology, P.O. Box 20, 53851 Lappeenranta, Finland.
Anal Chim Acta. 2007 Jul 9;595(1-2):209-15. doi: 10.1016/j.aca.2007.01.020. Epub 2007 Jan 16.
Sampling and the estimation of sampling uncertainty are important tasks when industrial processes are monitored. Missing values and unequal sources can cause problems in almost all industrial fields. One major problem is that samples may not be collected during weekends; alternatively, a composite sample may be collected over the weekend. These systematically occurring missing values (gaps) affect the uncertainties of the measurements. Another type of missing value is the random gap, caused, for example, by instrument failures. Pierre Gy's sampling theory includes tools to evaluate all error components involved in the sampling of heterogeneous materials. Variograms, introduced by Gy's sampling theory, have been developed to estimate the uncertainty of auto-correlated process measurements. Variographic experiments are used to estimate the variance for different sample selection strategies: random sampling, stratified random sampling and systematic sampling. In this paper the effects of both systematic and random gaps were estimated by using simulations and real process data. The process data were taken from bark boilers of pulp and paper mills (combustion processes). When systematic gaps were examined, linear interpolation was used to fill them. Cases involving composite sampling were also studied. The aims of this paper are to determine (1) how reliably a variogram calculated from data with systematic gaps estimates the process variogram, and (2) how the uncertainty due to gaps can be estimated when reporting time-averages of auto-correlated time series measurements. The results show that when systematic gaps were filled by linear interpolation, only minor changes in the variogram values were observed. The differences between the variograms were consistently smallest with composite samples.
For random gaps, the results show that for non-periodic processes the stratified random sampling strategy gives more reliable results than the systematic sampling strategy. Therefore, stratified random sampling should be used when estimating the uncertainty of random gaps in reporting time-averages of auto-correlated time series measurements.
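The comparison described in the abstract, a variogram of a complete series against a variogram of the same series after weekend gaps are filled by linear interpolation, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `variogram` and `fill_gaps_linear`, the simulated AR(1) series with coefficient 0.8, the daily sampling interval and the marking of two days out of every seven as weekend gaps. The variogram is the absolute experimental form V(j) = sum((x[i+j] - x[i])^2) / (2(N - j)); Gy's theory often works with a relative form based on heterogeneity contributions, which is not shown here.

```python
import numpy as np


def variogram(values, max_lag):
    """Absolute experimental variogram V(j) for lags j = 1..max_lag."""
    x = np.asarray(values, dtype=float)
    n = x.size
    return np.array([
        np.sum((x[j:] - x[:-j]) ** 2) / (2.0 * (n - j))
        for j in range(1, max_lag + 1)
    ])


def fill_gaps_linear(values):
    """Replace NaN entries by linear interpolation between known neighbours.

    Gaps at the very start or end are filled with the nearest known value,
    which is how np.interp treats points outside the known range.
    """
    x = np.asarray(values, dtype=float)
    idx = np.arange(x.size)
    known = ~np.isnan(x)
    return np.interp(idx, idx[known], x[known])


# Hypothetical auto-correlated process: an AR(1) series, one value per day.
rng = np.random.default_rng(0)
n = 364
series = np.empty(n)
series[0] = 0.0
for t in range(1, n):
    series[t] = 0.8 * series[t - 1] + rng.normal()

# Systematic gaps: treat days 5 and 6 of every 7-day week as missing.
gapped = series.copy()
weekend = (np.arange(n) % 7) >= 5
gapped[weekend] = np.nan

filled = fill_gaps_linear(gapped)

v_full = variogram(series, max_lag=10)
v_filled = variogram(filled, max_lag=10)

for lag, (a, b) in enumerate(zip(v_full, v_filled), start=1):
    print(f"lag {lag:2d}: complete = {a:.3f}, gap-filled = {b:.3f}")
```

The printed values depend on the simulated series and say nothing about the bark boiler data; the sketch only shows the mechanics of comparing the two variograms lag by lag.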