Yang Yang, Tian Hongli, Wang Rui, Wang Lu, Yi Hongmei, Liu Yawei, Xu Liwen, Fan Yaming, Zhao Jiuran, Wang Fengge
Maize Research Center, Beijing Academy of Agriculture and Forestry Sciences (BAAFS), Beijing Key Laboratory of Maize DNA Fingerprinting and Molecular Breeding, Beijing, China.
Front Plant Sci. 2021 Mar 18;12:566796. doi: 10.3389/fpls.2021.566796. eCollection 2021.
Molecular marker technology is widely used in plant variety discrimination, molecular breeding, and other fields. Molecular marker screening is important for lowering the cost of testing and improving the efficiency of data analysis. Screening usually involves two phases: the first controls loci quality and the second reduces loci quantity. To reduce loci quantity, an appraisal index that is highly sensitive to the specific scenario is needed to select loci combinations. In this study, we focused on loci combination screening for plant variety discrimination. A loci combination appraisal index, variety discrimination power (VDP), is proposed, and three statistical methods, probability-based VDP (P-VDP), comparison-based VDP (C-VDP), and ratio-based VDP (R-VDP), are described and compared. Results from simulated data showed that VDP was sensitive to statistical populations with convergence toward the same variety, whereas the total probability of discrimination power (TDP) method was effective only for partial populations. R-VDP was more sensitive to statistical populations with convergence toward various varieties than P-VDP and C-VDP, which had the same sensitivity as each other; TDP was not sensitive at all. With the real data, R-VDP values for sorghum, wheat, maize, and rice began to show a downward tendency when the number of loci was 20, 7, 100, and 100, respectively; for P-VDP and C-VDP (which gave the same results) the numbers were 6, 4, 9, and 19, and for TDP they were 6, 4, 4, and 11. For the variety threshold setting, R-VDP values of loci combinations with different numbers of loci responded evenly to different thresholds. C-VDP values responded unevenly to different thresholds, and the extent of the response increased as the number of loci decreased. All the methods gave underestimations when data were missing, with systematic errors increasing in the order TDP, C-VDP, R-VDP.
We concluded that VDP was a better loci combination appraisal index than TDP for plant variety discrimination and that the three VDP methods have different applications. We developed software called VDPtools, which can calculate the values of TDP, P-VDP, C-VDP, and R-VDP. VDPtools is publicly available at https://github.com/caurwx1/VDPtools.git.
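The abstract does not give the formulas behind P-VDP, C-VDP, or R-VDP, so the sketch below is only an illustration of the general idea of a comparison-style index: the share of variety pairs that a loci combination separates under a chosen difference threshold. It is not the paper's definition and not the VDPtools interface. The function names, the toy genotypes, and the rule of skipping loci with a missing call are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical toy data: four varieties scored at three loci.
# None marks a missing genotype call.
toy = {
    "V1": ["AA", "CC", "GG"],
    "V2": ["AA", "CC", "TT"],
    "V3": ["AG", "CC", "GG"],
    "V4": ["AA", "CC", "GG"],  # identical to V1 at every locus
}


def count_differences(a, b, loci):
    """Count selected loci where both calls are present and differ."""
    return sum(
        1
        for i in loci
        if a[i] is not None and b[i] is not None and a[i] != b[i]
    )


def pairwise_discrimination(varieties, loci, threshold=1):
    """Fraction of variety pairs with at least `threshold` differing loci.

    Assumed illustration only; loci with a missing call are skipped,
    so they can never count as a difference.
    """
    pairs = list(combinations(varieties.values(), 2))
    if not pairs:
        raise ValueError("need at least two varieties")
    separated = sum(
        1 for a, b in pairs if count_differences(a, b, loci) >= threshold
    )
    return separated / len(pairs)


# Compare a full loci combination with a reduced one and a stricter threshold.
all_loci = pairwise_discrimination(toy, loci=[0, 1, 2])
one_locus = pairwise_discrimination(toy, loci=[0])
stricter = pairwise_discrimination(toy, loci=[0, 1, 2], threshold=2)

# Same data with one call removed, to see the effect of missing data.
with_missing = dict(toy, V2=["AA", "CC", None])
missing_score = pairwise_discrimination(with_missing, loci=[0, 1, 2])

print(all_loci, one_locus, stricter, missing_score)
```

In this toy case, dropping loci, raising the threshold, or losing a genotype call each lowers the score. The last of these matches the direction of the underestimation reported above for missing data, though only under the skipping rule assumed here.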