Kerzenmacher Tobias E, Keckhut Philippe, Hauchecorne Alain, Chanin Marie-Lise
Service d'Aéronomie, Institut Pierre Simon Laplace, B.P. 3, 91371, Verrières-le-Buisson, France.
J Environ Monit. 2006 Jul;8(7):682-90. doi: 10.1039/b603750j. Epub 2006 Jun 6.
Multiple regression analyses have been widely used in recent years to detect trends, particularly in stratospheric ozone and temperature data sets. Confidence in a detected trend depends on several factors that generate uncertainty. Part of this uncertainty comes from random variability, which is the component usually considered; it can be estimated statistically from the residual deviations between the data and the fitted model. However, interference between the different sources of variability affecting a data set, such as the Quasi-Biennial Oscillation (QBO), volcanic aerosols, solar flux variability and the trend itself, can also be a critical source of error. This type of error has hitherto not been well quantified. In this work, an artificial data series was generated to carry out such estimates. The sources of error considered are the length of the data series, the choice of parameters used in the fitting model, and the time evolution of the trend in the data series. The curves provided here will allow future studies to gauge the magnitude of the methodological bias expected for a given case, as shown in several real examples. If the data series is shorter than a decade, the uncertainties are very large, whatever factors are chosen to identify the sources of variability. For natural variability, the errors can be limited if the analysed data set covers a sufficient number of periods (for periodic forcings). For the trend, however, the response to volcanic eruptions induces a bias whatever the length of the data series. The signal-to-noise ratio is a key factor: doubling the noise increases the length of record required to obtain an error smaller than 10% from 1 decade to 3-4 decades. Moreover, if non-linear trends are superimposed on the data and the series is longer than five years, a non-linear function has to be used to estimate trends. When the method is applied to real data series in which a breakpoint occurs, the study shows that data extending over 5 years are needed to detect a significant change in the slope of the ozone trends at mid-latitudes.
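The kind of experiment summarised in the abstract can be illustrated with a minimal sketch: build an artificial monthly series from a linear trend, a QBO-like term, a solar-cycle term and a volcanic pulse, add noise, and check how well ordinary least squares recovers the trend as the record length and the noise level change. This is not the authors' code. The function names, the regressor shapes (a 28-month sinusoid for the QBO, a 132-month sinusoid for the solar cycle, an exponentially decaying volcanic pulse), the amplitudes and the eruption date are all assumptions chosen for illustration. Because the fitted model here has exactly the same terms as the generating model, the sketch shows only the dependence on record length and noise, not the methodological biases from model mismatch that the study quantifies.

```python
import numpy as np

def make_regressors(n_months, eruption_month=24):
    """Design matrix: constant, linear trend, QBO, solar cycle, volcanic pulse.

    All regressor shapes are illustrative assumptions, not the paper's proxies.
    """
    t = np.arange(n_months, dtype=float)
    qbo = np.sin(2 * np.pi * t / 28.0)      # assumed ~28-month QBO period
    solar = np.sin(2 * np.pi * t / 132.0)   # assumed 11-year solar cycle
    dt = t - eruption_month
    # Zero before the eruption, then exponential decay with a 12-month e-folding time.
    volcanic = np.where(dt >= 0, np.exp(-np.clip(dt, 0, None) / 12.0), 0.0)
    # Trend regressor is time in decades, so its coefficient is "per decade".
    return np.column_stack([np.ones(n_months), t / 120.0, qbo, solar, volcanic])

def fit_trend(X, y):
    """Ordinary least-squares fit; returns the coefficient of the trend column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

def trend_error(n_months, noise_sd, n_trials=200, seed=0):
    """Relative RMS error of the fitted trend over repeated noise realisations."""
    rng = np.random.default_rng(seed)
    # Hypothetical true coefficients: offset, trend (-1 per decade), QBO, solar, volcanic.
    true = np.array([0.0, -1.0, 0.5, 0.5, 1.0])
    X = make_regressors(n_months)
    signal = X @ true
    fits = np.array([
        fit_trend(X, signal + rng.normal(0.0, noise_sd, size=n_months))
        for _ in range(n_trials)
    ])
    return np.sqrt(np.mean((fits - true[1]) ** 2)) / abs(true[1])

if __name__ == "__main__":
    for years in (5, 10, 20, 40):
        for noise_sd in (1.0, 2.0):
            err = trend_error(12 * years, noise_sd)
            print(f"{years:2d} yr, noise sd {noise_sd}: relative trend error {err:.2f}")
```

Running the script prints the relative trend error for several record lengths at two noise levels. Short records give large errors partly because the slowly varying solar term is nearly indistinguishable from a linear trend over a few years, which is one simple instance of the interference between regressors described above.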