Sheikh Hassan Aftab, Singh Alok, Kushwaha Neetu, Christiaen Christophe, Tkachenko Nataliya, Sabuco Juan, Caldecott Ben
Smith School of Enterprise and the Environment, University of Oxford, South Parks Road, Oxford, OX1 3QY, UK.
AI Centre of Excellence, Chief Data and AI Office, Lloyds Banking Group, London, EC2V 7HN, UK.
Sci Data. 2025 Jul 15;12(1):1240. doi: 10.1038/s41597-025-05521-8.
The agriculture sector is a major contributor to greenhouse gas emissions, yet the lack of asset-level farm data, including ownership, land use, and production, hinders effective transition finance and decarbonisation efforts. To address this gap, we developed an open-source farm-level dataset using natural language processing (NLP) and unsupervised learning, mapping farm names to spatial polygons to fill ownership and entity gaps. In England, this approach identified 117,116 farming entities with essential attributes such as addresses, land areas, crop types, production output, and geospatial coordinates. Such emerging datasets are also critical for financial instruments supporting sustainable agriculture: they enable verification of carbon credits, enhance sustainability-linked loans, and improve risk assessment for climate finance.
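As a rough illustration of the name-to-polygon mapping step mentioned above, the following is a minimal sketch and not the authors' pipeline. It swaps the NLP and unsupervised learning described in the abstract for plain string similarity from Python's standard library. The stopword list, the 0.8 threshold, the function names, and the example records are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical tokens to drop before comparison; not taken from the paper.
STOPWORDS = {"ltd", "limited", "llp", "the", "and", "co"}


def normalise(name: str) -> str:
    """Lowercase, strip punctuation, and drop common legal suffixes."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(t for t in tokens if t not in STOPWORDS)


def best_polygon(farm_name, polygons, threshold=0.8):
    """Return (polygon_id, score) for the closest label, or None if no
    label reaches the (assumed) similarity threshold."""
    target = normalise(farm_name)
    best_id, best_score = None, 0.0
    for polygon_id, label in polygons.items():
        score = SequenceMatcher(None, target, normalise(label)).ratio()
        if score > best_score:
            best_id, best_score = polygon_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None


# Invented example records: polygon identifiers mapped to text labels.
polygons = {"P1": "Manor Farm", "P2": "Hill Top Farm", "P3": "Oak Tree Farm"}
match = best_polygon("Hilltop Farm Ltd.", polygons)
no_match = best_polygon("Unrelated Orchard", polygons)
```

Here `match` links the registered name to polygon `P2`, while `no_match` is `None` because no label is similar enough. A real pipeline would likely also use addresses and spatial proximity to separate the many farms that share a common name.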