Evolution & Ecology Research Centre and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney 2052, Australia.
Gigascience. 2022 Aug 13;11. doi: 10.1093/gigascience/giac074.
Taxonomic bias is a known issue within the field of biology, causing scientific knowledge to be unevenly distributed across species. However, a systematic quantification of the research interest that the scientific community has allocated to individual species remains a big data problem. Scalable approaches are needed to integrate biodiversity data sets and bibliometric methods across large numbers of species. The outputs of these analyses are important for identifying understudied species and directing future research to fill these gaps.
In this study, we used the species h-index to quantify the research interest in 7,521 species of mammals. We tested factors potentially driving the species h-index using a Bayesian phylogenetic generalized linear mixed model (GLMM). We found that a third of the mammals had a species h-index of zero, while a select few had inflated research interest. Further, mammals with a higher species h-index had larger body masses; were found in temperate latitudes; had documented human uses, including domestication; and were in lower-risk International Union for Conservation of Nature Red List categories. These results surprisingly suggested that critically endangered mammals are understudied. A higher interest in domesticated species suggested that human use is a major driver and focus in mammalian scientific literature.
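As an illustration of the metric, the sketch below computes an h-index from a list of citation counts, assuming the species h-index follows the standard h-index definition applied to the publications about a species: the largest h such that h publications each have at least h citations. The function name, the species labels, and the citation counts are hypothetical and are not taken from the study's data or workflow.

```python
def h_index(citations):
    """Return the largest h such that h publications have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for the publications retrieved per species.
example_citations = {
    "Species A": [25, 8, 5, 3, 3, 1],
    "Species B": [0, 0],
    "Species C": [],  # no publications found
}

species_h = {name: h_index(counts) for name, counts in example_citations.items()}
print(species_h)
```

Under this definition, a species with no publications, or only uncited ones, receives an h-index of zero, which is how a large share of species can end up at zero.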
Our study demonstrates a scalable workflow, systematically identifies understudied species of mammals, and points to the likely drivers of this taxonomic bias in the literature. This case study can become a benchmark for future research that asks similar biological and meta-research questions for other taxa.