Department of Ecology, Biogeochemistry and Environmental Protection, Wrocław University, ul. Kanonia 6/8, 50-328 Wrocław, Poland.
J Chem Ecol. 2010 Sep;36(9):1029-34. doi: 10.1007/s10886-010-9832-0. Epub 2010 Aug 6.
In this study, the novel data mining technique Market Basket Analysis (MBA) was applied for the first time in biogeochemical and ecological investigations. The method was tested on the fern Athyrium distentifolium, in which we measured concentrations of the elements Ca, Cd, Cr, Cu, Fe, K, Mg, Mn, Na, Ni, Pb, and Zn. Plants were sampled from sites with different types of bedrock in the Tatra National Park in Poland. MBA was used to investigate whether specimens of Athyrium distentifolium that contain elevated levels of certain elements occur more frequently on a specific type of bedrock and to identify relationships between the type of bedrock and the concentrations of the elements in this fern. The results were compared with those of the commonly used principal component and classification analysis (PCCA) technique. MBA and PCCA ordination both yielded distinct groups of ferns growing on different types of bedrock. Although the results of MBA and PCCA were similar, MBA has the advantage of being independent of the size of the data set. In addition, MBA revealed not only dominant elements but, in the case of limestone bedrock, also showed very low concentrations of Cd, Fe, Mn, and Pb in ferns growing on this type of parent material. MBA, thus, appeared to be a promising data mining method to reveal chemical relations in the environment as well as the accumulation of chemical elements in bioindicators. This technique can be used to reveal associations and correlations among items in large data sets collected on a national or even larger scale.
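The abstract does not spell out how MBA scores an association, so the following is only a minimal sketch of the standard association-rule measures (support, confidence, lift) that the technique is generally built on. The specimens, the "high"/"low" discretisation of concentrations, the granite bedrock class, and the function names are all hypothetical illustrations, not data or code from the study.

```python
# Hypothetical toy data: each fern specimen is one "basket" of items.
# Item labels and values are invented for illustration only.
transactions = [
    {"bedrock=limestone", "Ca=high", "Mg=high"},
    {"bedrock=limestone", "Ca=high", "Pb=low"},
    {"bedrock=limestone", "Ca=high", "Mg=high", "Pb=low"},
    {"bedrock=granite", "Fe=high", "Pb=high"},
    {"bedrock=granite", "Fe=high", "Mn=high"},
    {"bedrock=granite", "Ca=high", "Fe=high"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    items = set(itemset)
    return sum(items <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Support, confidence and lift for the rule antecedent -> consequent.

    Returns None when the antecedent or consequent never occurs,
    because confidence and lift are undefined in that case.
    """
    a, c = set(antecedent), set(consequent)
    s_a = support(a, baskets)
    s_c = support(c, baskets)
    if s_a == 0 or s_c == 0:
        return None
    s_ac = support(a | c, baskets)
    confidence = s_ac / s_a      # estimate of P(consequent | antecedent)
    lift = confidence / s_c      # > 1 means co-occurrence above chance
    return {"support": s_ac, "confidence": confidence, "lift": lift}


if __name__ == "__main__":
    # Do specimens with elevated Ca occur more often on limestone?
    print(rule_metrics({"Ca=high"}, {"bedrock=limestone"}, transactions))
```

In this toy data set, a lift above 1 for the rule `Ca=high -> bedrock=limestone` would indicate that specimens with elevated Ca occur on limestone more often than expected by chance, which is the kind of question the abstract describes. Items such as `Pb=low` show how unusually low concentrations can be encoded as items alongside elevated ones.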