Cho Eunjin, Cho Sunghyun, Kim Minjun, Ediriweera Thisarani Kalhari, Seo Dongwon, Lee Seung-Sook, Cha Jihye, Jin Daehyeok, Kim Young-Kuk, Lee Jun Heon
Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea.
Research and Development Center, Insilicogen Inc., Yongin 19654, Korea.
J Anim Sci Technol. 2022 Sep;64(5):830-841. doi: 10.5187/jast.2022.e64. Epub 2022 Sep 30.
Genetic analysis has great potential as a tool for distinguishing species and breeds of livestock. In this study, the optimal combinations of single nucleotide polymorphism (SNP) markers for discriminating the Yeonsan Ogye chicken breed were identified using high-density 600K SNP array data. In 3,904 individuals from 198 chicken breeds, SNP markers specific to the target population were discovered through a case-control genome-wide association study (GWAS) and then filtered based on linkage disequilibrium blocks. Significant SNP markers were selected by feature selection using two machine learning algorithms: Random Forest (RF) and AdaBoost (AB). With this machine learning approach, the optimal combinations of 38 (RF) and 43 (AB) SNP markers for the Yeonsan Ogye chicken population achieved 100% accuracy. Hence, the GWAS and machine learning models used in this study can be used efficiently to identify the optimal combination of multiple SNP markers for discriminating target populations.
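The first two steps of the abstract (case-control association, then filtering by linkage disequilibrium) can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes genotypes coded as 0/1/2 copies of the alternate allele, uses a simple 2x2 allelic chi-square test in place of a full GWAS, and replaces LD-block filtering with greedy pruning on pairwise r². The function names, the toy data, and both cutoffs are hypothetical. The subsequent Random Forest and AdaBoost feature selection is not covered here.

```python
def allelic_chi2(case_genos, control_genos):
    """2x2 allelic chi-square statistic for one SNP (genotypes coded 0/1/2)."""
    a = sum(case_genos)                  # alternate alleles in cases
    b = 2 * len(case_genos) - a          # reference alleles in cases
    c = sum(control_genos)               # alternate alleles in controls
    d = 2 * len(control_genos) - c       # reference alleles in controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (an LD proxy)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy * sxy / (sxx * syy)


def select_markers(genotypes, is_case, chi2_cutoff, r2_cutoff):
    """Rank SNPs by association, then greedily drop SNPs in high LD with a kept one.

    genotypes: one list of per-individual genotypes per SNP.
    is_case:   one bool per individual (True = target breed).
    Both cutoffs are illustrative assumptions, not values from the study.
    """
    scores = []
    for j, snp in enumerate(genotypes):
        cases = [g for g, flag in zip(snp, is_case) if flag]
        controls = [g for g, flag in zip(snp, is_case) if not flag]
        scores.append((allelic_chi2(cases, controls), j))

    kept = []
    # Strongest association first; ties broken by lower SNP index.
    for score, j in sorted(scores, key=lambda t: (-t[0], t[1])):
        if score < chi2_cutoff:
            break
        if all(r_squared(genotypes[j], genotypes[k]) < r2_cutoff for k in kept):
            kept.append(j)
    return kept


# Toy data: 4 target-breed individuals followed by 4 others, 4 SNPs.
is_case = [True] * 4 + [False] * 4
genotypes = [
    [2, 2, 2, 2, 0, 0, 0, 0],  # SNP 0: separates the groups perfectly
    [2, 2, 2, 2, 0, 0, 0, 0],  # SNP 1: identical to SNP 0 (perfect LD)
    [1, 0, 1, 0, 1, 0, 1, 0],  # SNP 2: unrelated to group membership
    [2, 2, 1, 2, 0, 1, 0, 0],  # SNP 3: associated, r^2 = 0.75 with SNP 0
]

kept = select_markers(genotypes, is_case, chi2_cutoff=5.0, r2_cutoff=0.8)
print(kept)  # [0, 3]
```

In this toy run SNP 1 is dropped because it is redundant with SNP 0, and SNP 2 is dropped because it shows no association. The surviving markers would then be the candidates passed to a classifier-based feature selection step such as the RF and AB models the abstract describes.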