Agronomie et Environnement, UMR 1121, Université de Lorraine Vandoeuvre-lès-Nancy, F-54500, France ; Agronomie et Environnement, UMR 1121, INRA Colmar, F-68000, France ; Agroscope Reckenholz-Tänikon Research Station ART Zurich, Switzerland.
Agronomie et Environnement, UMR 1121, Université de Lorraine Vandoeuvre-lès-Nancy, F-54500, France ; Agronomie et Environnement, UMR 1121, INRA Colmar, F-68000, France.
Ecol Evol. 2014 Apr;4(7):944-58. doi: 10.1002/ece3.989. Epub 2014 Feb 25.
Functional trait databases are powerful tools in ecology, though most of them contain large amounts of missing values. The goal of this study was to test the effect of imputation methods on the evaluation of trait values at the species level and on the subsequent calculation of functional diversity indices at the community level. Two simple imputation methods (average and median), two methods based on ecological hypotheses, and one multiple imputation method were tested using a large plant trait database, together with the influence of the percentage of missing data and differences between functional traits. At the community level, the complete-case approach and three functional diversity indices calculated from grassland plant communities were included. At the species level, one of the methods based on ecological hypotheses was more accurate for all traits than imputation with average or median values, but the multiple imputation method was superior for most of the traits. The method based on functional proximity between species was the best method for traits with an unbalanced distribution, while the method based on the existence of relationships between traits was the best for traits with a balanced distribution. The ranking of the grassland communities by their functional diversity indices was not robust with the complete-case approach, even for low percentages of missing data. With the imputation methods based on ecological hypotheses, functional diversity indices could be computed with up to 30% missing data without affecting the ranking of the grassland communities. The multiple imputation method performed well, but for the functional identity and range of the communities it was not better than single imputation based on ecological hypotheses and adapted to the distribution of the trait values.
Ecological studies using functional trait databases have to deal with missing data using imputation methods that correspond to their specific needs and make the most of the information available in the databases. Within this framework, this study indicates the possibilities and limits of single imputation methods based on ecological hypotheses and concludes that they could be useful when studying the ranking of communities by their functional diversity indices.
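To make the contrast between the complete-case approach and the two simple imputation methods concrete, the following minimal Python sketch may help. It is an illustration only, not the authors' implementation: the trait values, the abundances, and the function names `impute` and `community_weighted_mean` are hypothetical, and the community-weighted mean is assumed here as one common way to express functional identity. The methods based on ecological hypotheses and the multiple imputation method are not shown.

```python
from statistics import mean, median


def impute(values, method="mean"):
    """Fill missing entries (None) with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    if method == "mean":
        fill = mean(observed)
    elif method == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown method: {method}")
    return [fill if v is None else v for v in values]


def community_weighted_mean(traits, abundances):
    """Abundance-weighted mean trait value of one community.

    Species whose trait value is missing (None) are dropped and the
    remaining abundances are renormalised, i.e. a complete-case calculation.
    """
    pairs = [(t, a) for t, a in zip(traits, abundances) if t is not None]
    total = sum(a for _, a in pairs)
    if total == 0:
        raise ValueError("no species with an observed trait value")
    return sum(t * a for t, a in pairs) / total


# Hypothetical trait values for five species; None marks a missing value.
trait = [20.0, None, 30.0, 13.0, None]
# Hypothetical relative abundances of the same species in one community.
abundance = [0.4, 0.3, 0.1, 0.1, 0.1]

cwm_complete_case = community_weighted_mean(trait, abundance)
cwm_mean_imputed = community_weighted_mean(impute(trait, "mean"), abundance)
cwm_median_imputed = community_weighted_mean(impute(trait, "median"), abundance)

print(cwm_complete_case, cwm_mean_imputed, cwm_median_imputed)
```

In this toy community the three approaches give slightly different index values, which is the kind of shift that could change the ranking of communities when many species lack trait data.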