Tang Zhong, You Ting-Ting, Li Ya-Fang, Tang Zhi-Xian, Bao Miao-Qing, Dong Ge, Xu Zhong-Rui, Wang Peng, Zhao Fang-Jie
State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, China.
State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, China; Centre for Agriculture and Health, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, 210095, China.
Environ Pollut. 2023 Jun 1;326:121501. doi: 10.1016/j.envpol.2023.121501. Epub 2023 Mar 22.
Excessive accumulation of cadmium (Cd) in rice grains threatens food safety and human health. Growing low-Cd-accumulating rice cultivars is an effective approach to producing low-Cd rice. However, field screening of low-Cd rice cultivars is laborious, time-consuming, and subject to the influence of environment × genotype interactions. In the present study, we investigated whether machine learning-based methods incorporating genotype and soil Cd concentration can identify high- and low-Cd-accumulating rice cultivars. One hundred and sixty-seven locally adapted high-yielding rice cultivars were grown in three fields with different soil Cd levels and genotyped using four molecular markers related to grain Cd accumulation. We identified sixteen cultivars as stable low-Cd accumulators, with grain Cd concentrations below the 0.2 mg kg⁻¹ food safety limit in all three paddy fields. In addition, we developed eight machine learning-based models to predict low- and high-Cd-accumulating rice cultivars with genotypes and soil Cd levels as input data. The optimized model classifies cultivars as low- or high-Cd (i.e., grain Cd concentration below or above 0.2 mg kg⁻¹) with an overall accuracy of 76%. These results indicate that machine learning-based classification models constructed with molecular markers and soil Cd levels can quickly and accurately identify high- and low-Cd-accumulating rice cultivars.
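The labelling and scoring steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the data layout (a mapping from cultivar name to one grain Cd value per field), and the choice to treat a value exactly at the limit as "high" are all assumptions made for illustration. It covers only the 0.2 mg/kg threshold rule, the "below the limit in every field" criterion for stable low-Cd accumulators, and the overall-accuracy metric; the eight machine learning models themselves are not reproduced.

```python
from typing import Dict, List

# Food safety limit for grain Cd stated in the abstract (mg per kg).
CD_LIMIT_MG_PER_KG = 0.2


def classify_grain_cd(grain_cd: float, limit: float = CD_LIMIT_MG_PER_KG) -> str:
    """Label one measurement 'low' (below the limit) or 'high' (otherwise).

    Assumption: a value exactly equal to the limit is labelled 'high';
    the abstract only says "below or above".
    """
    return "low" if grain_cd < limit else "high"


def stable_low_cd(
    measurements: Dict[str, List[float]],
    limit: float = CD_LIMIT_MG_PER_KG,
) -> List[str]:
    """Return cultivars whose grain Cd is below the limit in every field.

    `measurements` maps a (hypothetical) cultivar name to its grain Cd
    concentrations, one value per field.
    """
    return sorted(
        name
        for name, values in measurements.items()
        if values and all(v < limit for v in values)
    )


def overall_accuracy(predicted: List[str], observed: List[str]) -> float:
    """Fraction of predictions that match the observed class labels."""
    if not observed or len(predicted) != len(observed):
        raise ValueError("predicted and observed must be non-empty and equal in length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)
```

With made-up values such as `{"cultivar_A": [0.05, 0.11, 0.18], "cultivar_B": [0.08, 0.25, 0.40]}`, `stable_low_cd` keeps only `cultivar_A`, because `cultivar_B` exceeds the limit in two of its three fields. These numbers are illustrative and are not data from the study.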