Grantham Neal S, Reich Brian J, Pacifici Krishna, Laber Eric B, Menninger Holly L, Henley Jessica B, Barberán Albert, Leff Jonathan W, Fierer Noah, Dunn Robert R
Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America.
Department of Applied Ecology, North Carolina State University, Raleigh, North Carolina, United States of America.
PLoS One. 2015 Apr 13;10(4):e0122605. doi: 10.1371/journal.pone.0122605. eCollection 2015.
There is a long history of archaeologists and forensic scientists using pollen found in a dust sample to identify its geographic origin or history. Such palynological approaches have important limitations as they require time-consuming identification of pollen grains, a priori knowledge of plant species distributions, and a sufficient diversity of pollen types to permit spatial or temporal identification. We demonstrate an alternative approach based on DNA sequencing analyses of the fungal diversity found in dust samples. Using nearly 1,000 dust samples collected from across the continental U.S., our analyses identify up to 40,000 fungal taxa from these samples, many of which exhibit a high degree of geographic endemism. We develop a statistical learning algorithm via discriminant analysis that exploits this geographic endemicity in the fungal diversity to correctly identify samples to within a few hundred kilometers of their geographic origin with high probability. In addition, our statistical approach provides a measure of certainty for each prediction, in contrast with current palynology methods that are almost always based on expert opinion and devoid of statistical inference. Fungal taxa found in dust samples can therefore be used to identify the origin of that dust and, more importantly, we can quantify our degree of certainty that a sample originated in a particular place. This work opens up a new approach to forensic biology that could be used by scientists to identify the origin of dust or soil samples found on objects, clothing, or archaeological artifacts.
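The abstract's central idea, classifying a dust sample to a region from its fungal taxa and reporting a probability for each candidate region, can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a simple naive Bayes style discriminant on taxon presence/absence, and the region labels, taxon indices, smoothing constant `alpha`, and function names are all hypothetical and chosen only for illustration.

```python
import math


def fit_region_profiles(samples, n_taxa, alpha=1.0):
    """Estimate, per region, the smoothed probability that each taxon is present.

    samples: list of (region_label, set_of_taxon_indices) pairs.
    alpha:   Laplace smoothing constant (an assumption of this sketch) that
             keeps every probability strictly between 0 and 1.
    """
    counts, totals = {}, {}
    for region, taxa in samples:
        row = counts.setdefault(region, [0] * n_taxa)
        totals[region] = totals.get(region, 0) + 1
        for t in taxa:
            row[t] += 1
    profiles = {
        region: [(c + alpha) / (totals[region] + 2 * alpha) for c in row]
        for region, row in counts.items()
    }
    return profiles, totals


def posterior_over_regions(taxa, profiles, totals):
    """Return P(region | observed taxa) for every region.

    Treats taxa as independent given the region (the naive Bayes assumption)
    and uses the training frequencies of the regions as the prior.
    """
    n_total = sum(totals.values())
    log_post = {}
    for region, probs in profiles.items():
        lp = math.log(totals[region] / n_total)
        for t, p in enumerate(probs):
            lp += math.log(p) if t in taxa else math.log(1.0 - p)
        log_post[region] = lp
    # Normalise in log space to avoid underflow when there are many taxa.
    m = max(log_post.values())
    weights = {r: math.exp(lp - m) for r, lp in log_post.items()}
    z = sum(weights.values())
    return {r: w / z for r, w in weights.items()}


# Hypothetical toy data: taxa 0-1 are endemic to one region, taxa 2-3 to the
# other, and taxon 4 is widespread.
training = [
    ("northwest", {0, 1}),
    ("northwest", {0, 1, 4}),
    ("southeast", {2, 3}),
    ("southeast", {2, 3, 4}),
]
profiles, totals = fit_region_profiles(training, n_taxa=5)
posterior = posterior_over_regions({0, 1, 4}, profiles, totals)
best = max(posterior, key=posterior.get)
print(best, posterior)
```

The returned dictionary plays the role of the "measure of certainty" mentioned in the abstract: the predicted region is the one with the highest posterior probability, and that probability indicates how strongly the endemic taxa favor it over the alternatives. A real analysis would work with continuous coordinates and tens of thousands of taxa rather than two labelled regions.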