Shimada Shoki, Takeuchi Wataru
Institute of Industrial Science, The University of Tokyo, Meguro-ku, 153-8505, Japan.
Sci Rep. 2025 Jul 22;15(1):26671. doi: 10.1038/s41598-025-11222-4.
The recent development of solar photovoltaics (PV) has generated considerable interest in energy management, appropriate environmental impact assessment, and the seamless integration of PV technology into society. A critical first step in exploring these research opportunities is the creation of a comprehensive PV database that describes the locations and extents of existing PV installations. Automated solar PV detection from satellite remote sensing imagery, based on machine learning, is particularly suitable for studying the characteristics of national-scale solar PV distribution and its impact on the environment. In our study, we first proposed an XGBoost-based solar PV detection method with post-processing procedures supported by a dedicated solar PV spectral index. This approach was applied to Sentinel-2 images acquired in 2022 to create a national solar PV database for Japan. The resulting solar PV map achieved an overall accuracy of 0.984. Our dataset showed that solar PV installations cover a total area of 571 km² in Japan. Comparison of PV extents with a land cover map showed that megawatt-scale solar PV facilities were predominantly located in forested areas, suggesting potential changes to existing forest ecosystems and the local environment at these facility locations. Conversely, smaller megawatt-scale PV systems showed a similar preference for farmland and forest. PV expansion also contributed to forest fragmentation at forest edges. To investigate these findings further, we performed clustering analyses to identify areas with a high concentration of PV and analyzed the distribution of solar PV alongside socio-economic and environmental factors using an explainable AI approach based on Shapley values. Through this study, we showed how the established PV dataset can be used to uncover spatial patterns and driving factors of PV deployment. Our results indicate that site selection is influenced by many variables, such as local environmental conditions, power demand, and installation costs, highlighting the need for well-informed strategies when deploying solar PV. Overall, this study demonstrates the efficacy of integrating machine learning models, spectral indices, and post-processing techniques with satellite remote sensing data to accurately map and analyze solar PV installations. Regular updates of these maps from freely available satellite datasets can provide valuable insights for policymakers and stakeholders, enabling data-driven decisions regarding the placement, monitoring, and management of PV systems, and supporting a timely transition to a renewable-powered society.
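The abstract does not give the formula of the PV spectral index or the details of the post-processing, so the following is only a minimal sketch of the general idea: a classifier's per-pixel PV predictions are filtered with a spectral-index threshold, and overall accuracy is computed as the fraction of pixels that agree with reference labels. The generic normalized-difference index, the band values, the threshold of 0.2, and all function names are assumptions made for illustration, not the study's index or settings.

```python
# Illustrative sketch only. The index form, band values, threshold, and
# function names are hypothetical; they are not taken from the study.

def normalized_difference(band_a, band_b):
    """Generic normalized-difference index; 0.0 where the band sum is zero."""
    return [
        (a - b) / (a + b) if (a + b) != 0 else 0.0
        for a, b in zip(band_a, band_b)
    ]

def filter_predictions(predicted, index_values, threshold):
    """Keep a PV prediction (1) only where the index reaches the threshold."""
    return [
        1 if p == 1 and v >= threshold else 0
        for p, v in zip(predicted, index_values)
    ]

def overall_accuracy(predicted, reference):
    """Fraction of pixels whose predicted label equals the reference label."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for p, r in zip(predicted, reference) if p == r)
    return correct / len(reference)

# Toy data: four pixels with two reflectance bands each (made-up values).
band_a = [0.30, 0.10, 0.25, 0.05]
band_b = [0.10, 0.10, 0.05, 0.20]
raw = [1, 1, 1, 0]        # classifier output, 1 = PV (second pixel is a false positive)
reference = [1, 0, 1, 0]  # reference labels

index_values = normalized_difference(band_a, band_b)
cleaned = filter_predictions(raw, index_values, threshold=0.2)

print(overall_accuracy(raw, reference))      # accuracy before filtering
print(overall_accuracy(cleaned, reference))  # accuracy after filtering
```

In this toy case the index-based filter removes the single false positive. Real imagery would call for a threshold tuned on validation data, and an index designed for PV surfaces rather than this placeholder.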