Rahimi Ehsan, Jung Chuleui
Agricultural Research Institute, GyeongKuk National University, Andong 36729, Republic of Korea.
Department of Plant Medical, GyeongKuk National University, Andong 36729, Republic of Korea.
Insects. 2025 Jul 26;16(8):769. doi: 10.3390/insects16080769.
Research in biogeography, ecology, and biodiversity depends on comprehensive datasets that detail species distributions and environmental conditions, and the Global Biodiversity Information Facility (GBIF) is at the forefront of providing them. This study investigates spatial biases and temporal trends in insect pollinator occurrence data within the GBIF dataset, focusing on three pivotal pollinator groups: bees, hoverflies, and butterflies. Identifying and addressing gaps in GBIF data is essential for comprehensive analyses and informed pollinator conservation efforts. In 2024, we obtained occurrence data from GBIF for seven bee families, six butterfly families, and the hoverfly family Syrphidae. Spatial biases were addressed by eliminating duplicate records with identical latitude and longitude coordinates. Species richness was assessed for each family and country. Temporal trends were examined by tallying annual occurrence records for each pollinator family, and the diversity of data sources within GBIF was evaluated by counting unique data publishers. The initial occurrence counts were 4,922,390 for bees, 1,703,131 for hoverflies, and 31,700,696 for butterflies, a substantial portion of which were duplicate records. On average, duplicate elimination removed 81.4% of bee data, 77.2% of hoverfly data, and 65.4% of butterfly data. Our dataset encompassed 9286 unique bee species, 2574 hoverfly species, and 17,895 butterfly species. The temporal analysis revealed a notable trend in data recording: 80% of bee and butterfly data were collected after 2022, and hoverflies reached a similar threshold after 2023. The United States, Germany, the United Kingdom, and Sweden consistently emerged as the top countries for occurrence data across all three groups. The analysis of data publishers highlighted iNaturalist.org as a top contributor to bee data.
Overall, we uncovered significant biases in the occurrence data of pollinators from GBIF. These biases pose substantial challenges for future research on pollinator ecology and biodiversity conservation.
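The processing steps summarized in the abstract (coordinate-based duplicate removal, species richness per family and country, annual record tallies, and a count of unique data publishers) can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes each occurrence is a dictionary keyed by GBIF download column names (`species`, `family`, `countryCode`, `year`, `decimalLatitude`, `decimalLongitude`, `publishingOrgKey`), and it assumes duplicates are judged per species, a detail the abstract does not specify. The function names and the toy records are hypothetical.

```python
from collections import Counter, defaultdict


def drop_coordinate_duplicates(records):
    """Keep the first record for each (species, latitude, longitude) triple.

    Assumption: duplicates are defined per species; the abstract only says
    "identical latitude and longitude coordinates".
    """
    seen = set()
    kept = []
    for rec in records:
        key = (rec["species"], rec["decimalLatitude"], rec["decimalLongitude"])
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


def percent_removed(n_before, n_after):
    """Share of records discarded by duplicate removal, in percent."""
    return 100.0 * (n_before - n_after) / n_before


def species_richness(records, group_field):
    """Number of distinct species per value of group_field."""
    species_by_group = defaultdict(set)
    for rec in records:
        species_by_group[rec[group_field]].add(rec["species"])
    return {group: len(names) for group, names in species_by_group.items()}


def records_per_year(records):
    """Tally of occurrence records for each year."""
    return Counter(rec["year"] for rec in records)


def unique_publishers(records):
    """Count of distinct data publishers among the records."""
    return len({rec["publishingOrgKey"] for rec in records})


# Toy records for illustration only; these are not GBIF data.
records = [
    {"species": "Apis mellifera", "family": "Apidae", "countryCode": "US",
     "year": 2020, "decimalLatitude": 40.0, "decimalLongitude": -75.0,
     "publishingOrgKey": "pub-a"},
    {"species": "Apis mellifera", "family": "Apidae", "countryCode": "US",
     "year": 2021, "decimalLatitude": 40.0, "decimalLongitude": -75.0,
     "publishingOrgKey": "pub-b"},
    {"species": "Bombus terrestris", "family": "Apidae", "countryCode": "DE",
     "year": 2021, "decimalLatitude": 52.5, "decimalLongitude": 13.4,
     "publishingOrgKey": "pub-a"},
    {"species": "Episyrphus balteatus", "family": "Syrphidae", "countryCode": "DE",
     "year": 2021, "decimalLatitude": 52.5, "decimalLongitude": 13.4,
     "publishingOrgKey": "pub-c"},
]

deduped = drop_coordinate_duplicates(records)
removed_pct = percent_removed(len(records), len(deduped))
richness_by_family = species_richness(deduped, "family")
richness_by_country = species_richness(deduped, "countryCode")
yearly_counts = records_per_year(deduped)
publisher_count = unique_publishers(records)
```

Keeping the first record per key is a design choice made here for simplicity. Which record survives affects the yearly tallies and publisher counts computed afterward, so a real analysis would need to state whether those statistics are taken before or after duplicate removal.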