Abbasi Holasou Hossein, Panahi Bahman, Shahi Ali, Nami Yousef
Department of Plant Breeding and Biotechnology, Faculty of Agriculture, University of Tabriz, Tabriz, Iran.
Department of Genomics, Branch for Northwest and West Region, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research, Education and Extension Organization (AREEO), Tabriz, Iran.
Biochem Biophys Rep. 2024 Mar 10;38:101678. doi: 10.1016/j.bbrep.2024.101678. eCollection 2024 Jul.
Efficient analytical techniques are needed to interpret biological data effectively, generate novel hypotheses, and find critical predictive patterns. Machine learning algorithms offer an opportunity to develop low-cost, practical solutions in biology. In this study, we propose a new integrated analytical approach that applies supervised machine learning algorithms to microsatellite data from worldwide Vitis populations. A total of 1378 wild ( spp. ) and cultivated ( spp. ) grapevine accessions were investigated using 20 microsatellite markers. Data cleaning, feature selection, and supervised machine learning classification models, viz. Naive Bayes, Support Vector Machine (SVM), and Tree Induction methods, were applied to find the most indicative and diagnostic alleles for representing the wild/cultivated status and geographic origin of each population. Our combined approach identified the microsatellite markers with the highest differentiating capacity and demonstrated the efficiency of our pipeline for classifying and predicting Vitis accessions. Moreover, our study proposes the best combination of markers for distinguishing populations, which can be exploited in future germplasm conservation and breeding programs.
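To make the feature-selection step concrete, here is a minimal sketch of ranking alleles by how well their presence or absence separates wild from cultivated accessions. The abstract does not say which selection criterion was used, so information gain is an assumption here, and the allele names and genotype matrix are hypothetical toy data rather than values from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a feature's values."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [lab for f, lab in zip(feature, labels) if f == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical toy data (not from the study): rows are accessions,
# columns record allele presence (1) or absence (0).
allele_names = ["locusA_180", "locusA_186", "locusB_240"]
genotypes = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
labels = ["wild", "wild", "wild", "cultivated", "cultivated", "cultivated"]

# Score every allele column, then sort from most to least informative.
ranking = sorted(
    (
        (information_gain([row[i] for row in genotypes], labels), name)
        for i, name in enumerate(allele_names)
    ),
    reverse=True,
)

for gain, name in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy matrix, `locusA_180` is present in every wild accession and absent from every cultivated one, so it ranks first. In a fuller pipeline the top-ranked alleles would then be passed to the classifiers named in the abstract.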