Department of Organismic and Evolutionary Biology, Harvard University, 22 Divinity Avenue, Cambridge, Massachusetts 02138, USA.
Ecol Appl. 2013 Jan;23(1):273-86. doi: 10.1890/12-0747.1.
Primarily driven by concern about rising levels of atmospheric CO2, ecologists and earth system scientists are collecting vast amounts of data related to the carbon cycle. These measurements are generally time consuming and expensive to make, and, unfortunately, we live in an era where research funding is increasingly hard to come by. Thus, important questions are: "Which data streams provide the most valuable information?" and "How much data do we need?" These questions are relevant not only for model developers, who need observational data to improve, constrain, and test their models, but also for experimentalists and those designing ecological observation networks. Here we address these questions using a model-data fusion approach. We constrain a process-oriented, forest ecosystem C cycle model with 17 different data streams from the Harvard Forest (Massachusetts, USA). We iteratively rank each data source according to its contribution to reducing model uncertainty. Results show the importance of some measurements commonly unavailable to carbon-cycle modelers, such as estimates of turnover times from different carbon pools. Surprisingly, many data sources are relatively redundant in the presence of others and do not lead to a significant improvement in model performance. A few select data sources lead to the largest reduction in parameter-based model uncertainty. Projections of future carbon cycling were poorly constrained when only hourly net-ecosystem-exchange measurements were used to inform the model. They were well constrained, however, with only 5 of the 17 data streams, even though many individual parameters are not constrained. The approach taken here should stimulate further cooperation between modelers and measurement teams and may be useful in the context of setting research priorities and allocating research funds.
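The iterative ranking described above can be illustrated with a greedy forward selection: at each step, add the data stream that most reduces the uncertainty of a projected quantity, given the streams already selected. The following is a minimal sketch under strong assumptions, not the authors' model or procedure. The stream names, parameter names, variances, and sensitivities are invented for illustration, and the Gaussian precision update for independent parameters stands in for the full model-data fusion step.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

# Hypothetical toy setup (not data from the paper): each stream directly
# observes some parameters, with the given observation-error variance.
STREAMS: Dict[str, Dict[str, float]] = {
    "nee_hourly": {"photosynthesis": 0.05, "respiration": 0.5},
    "soil_respiration": {"respiration": 0.1},
    "litterfall": {"turnover": 0.8},
    "pool_turnover_time": {"turnover": 0.05},
}
# Assumed prior variance of each (independent) parameter.
PRIOR_VAR: Dict[str, float] = {
    "photosynthesis": 1.0,
    "respiration": 1.0,
    "turnover": 1.0,
}
# Assumed sensitivity of the projected quantity to each parameter.
SENSITIVITY: Dict[str, float] = {
    "photosynthesis": 1.0,
    "respiration": 1.0,
    "turnover": 2.0,
}


def projection_variance(selected: FrozenSet[str]) -> float:
    """Variance of the projection given the selected streams.

    Precisions (inverse variances) from the prior and from every selected
    stream that observes a parameter are summed, then propagated linearly
    to the projection through the squared sensitivities.
    """
    total = 0.0
    for param, prior in PRIOR_VAR.items():
        precision = 1.0 / prior
        for name in selected:
            obs_var = STREAMS[name].get(param)
            if obs_var is not None:
                precision += 1.0 / obs_var
        total += SENSITIVITY[param] ** 2 / precision
    return total


def rank_streams(
    names: Iterable[str],
    uncertainty: Callable[[FrozenSet[str]], float],
) -> List[Tuple[str, float]]:
    """Greedy ranking: repeatedly add the stream that yields the lowest
    uncertainty when combined with the streams already selected."""
    selected: FrozenSet[str] = frozenset()
    remaining = set(names)
    ranking: List[Tuple[str, float]] = []
    while remaining:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = min(sorted(remaining), key=lambda n: uncertainty(selected | {n}))
        selected = selected | {best}
        remaining.remove(best)
        ranking.append((best, uncertainty(selected)))
    return ranking


if __name__ == "__main__":
    for stream, variance in rank_streams(STREAMS, projection_variance):
        print(f"{stream}: remaining projection variance = {variance:.4f}")
```

Because each candidate is scored conditional on the streams already chosen, a stream that informs an already well-constrained parameter yields little further reduction and ranks late, which is one way the redundancy between data sources noted in the abstract can arise.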