Dastres Emran, Sonboli Ali, Esmaeili Hassan, Mirjalili Mohammad Hossein, Edalat Mohsen
Department of Agriculture, Medicinal Plants and Drugs Research Institute, Shahid Beheshti University, Tehran, 1983969411, Iran.
Department of Biology, Medicinal Plants and Drugs Research Institute, Shahid Beheshti University, Tehran, 1983969411, Iran.
Sci Rep. 2025 Aug 27;15(1):31535. doi: 10.1038/s41598-025-17039-5.
Nepeta persica is a medicinal plant with significant pharmacological potential, primarily attributed to its high nepetalactone content. Understanding the environmental drivers of nepetalactone biosynthesis is essential for optimizing both cultivation and conservation strategies. In this study, we combined machine learning algorithms (random forest, support vector machines, gradient boosting machines) with a hybrid ensemble model (RF-SVM-GBM), alongside statistical approaches (generalized linear models [GLM] and partial least squares [PLS]) and geospatial analyses (GIS, remote sensing, habitat suitability modeling) to assess the influence of climatic, topographic, and edaphic factors on nepetalactone concentration in N. persica across Fars province, Iran. The results identified elevation, south-facing slopes, and mean annual temperature as the most critical determinants of nepetalactone accumulation. The hybrid ensemble model demonstrated the highest predictive accuracy, reducing RMSE by 21.1% (RMSE = 0.015) compared to individual models. Habitat suitability maps revealed Marvdasht and Shiraz counties as the most favorable regions for cultivating N. persica with high nepetalactone concentrations, followed by smaller high-suitability zones in Northeast Firozabad and Northern Kazerun. In contrast, areas such as Abadeh, Eqlid, and Khorrambid exhibited lower suitability. These findings provide actionable insights for precision agriculture, resource-efficient cultivation, and climate-adaptive conservation of medicinal plants. By integrating ecological modeling with machine learning, this research offers a scalable, data-driven framework to support the sustainable production of high-value secondary metabolites in environmentally challenging regions.
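The RMSE comparison reported above can be illustrated with a minimal sketch. The abstract does not state how the RF-SVM-GBM ensemble combines its members, so the unweighted averaging used here is an assumption, and the function names and sample numbers are hypothetical rather than taken from the study. The helper `implied_baseline` only works through the arithmetic of the reported figures: if the 21.1% reduction is measured against a single baseline RMSE, an ensemble RMSE of 0.015 would correspond to a baseline of roughly 0.019.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def average_ensemble(*predictions):
    """Unweighted mean of several models' predictions.

    ASSUMPTION: the abstract does not describe the combination rule;
    simple averaging is used here purely for illustration.
    """
    return [sum(values) / len(values) for values in zip(*predictions)]


def relative_reduction(baseline, improved):
    """Fractional RMSE reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


def implied_baseline(improved, reduction):
    """Baseline RMSE implied by an improved RMSE and a fractional reduction."""
    return improved / (1.0 - reduction)


# Hypothetical values (NOT data from the study).
observed = [0.10, 0.20, 0.30, 0.40]
rf_pred = [0.12, 0.18, 0.33, 0.37]
svm_pred = [0.08, 0.23, 0.28, 0.42]
gbm_pred = [0.11, 0.19, 0.31, 0.41]

individual_rmse = {
    "RF": rmse(observed, rf_pred),
    "SVM": rmse(observed, svm_pred),
    "GBM": rmse(observed, gbm_pred),
}
ensemble_pred = average_ensemble(rf_pred, svm_pred, gbm_pred)
ensemble_rmse = rmse(observed, ensemble_pred)

# Reduction relative to the best individual model in this toy example.
best_individual = min(individual_rmse.values())
toy_reduction = relative_reduction(best_individual, ensemble_rmse)

# Arithmetic on the figures reported in the abstract.
reported_baseline = implied_baseline(0.015, 0.211)
```

In this toy data the members' errors partly cancel, so the averaged prediction has a lower RMSE than any single member. That is not guaranteed in general, which is why an ensemble is normally compared against each member on held-out data.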