Classification accuracy of machine learning algorithms for Chinese local cattle breeds using genomic markers.

Affiliation

College of Animal Science and Technology, China Agricultural University, Beijing 100193, China.

Publication information

Yi Chuan. 2024 Jul;46(7):530-539. doi: 10.16288/j.yczz.24-059.

DOI:10.16288/j.yczz.24-059
PMID:39016086
Abstract

Accurate breed classification is required for the conservation and utilization of farm animal genetic resources. Traditional classification methods rely mainly on phenotypic characterization, but highly similar breeds are difficult to distinguish because phenotypic characters are hard to qualify. Machine learning algorithms show unique advantages in breed classification using genomic information. To evaluate classification methods for Chinese cattle breeds, this study used genomic SNP data from 213 individuals across seven Chinese local breeds and compared the classification accuracies of three feature selection methods (F value sorting and screening, mRMR, and Relief-F) and three machine learning algorithms (Random Forest, Support Vector Machine (SVM), and Naive Bayes (NB)). The results showed that: 1) when more than 1500 SNPs were screened with the F method, or more than 1000 SNPs with the mRMR algorithm, the SVM classifier achieved a classification accuracy above 99.47%; 2) the most effective algorithm was SVM, followed by NB, while the best SNP selection methods were F and mRMR, followed by Relief-F; 3) misclassification often occurred between highly similar breeds. This study demonstrates that machine learning classification models combined with genomic data are effective for classifying local cattle breeds, providing a technical basis for the rapid and accurate classification of cattle breeds in China.
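
The abstract does not spell out how "F value sorting and screening" was computed. As a rough illustration only, the sketch below assumes it means ranking each SNP by a one-way ANOVA F statistic across breeds and keeping the top-ranked SNPs. The function names `anova_f` and `top_snps_by_f`, the 0/1/2 genotype coding, and the synthetic data are all hypothetical and are not taken from the study.

```python
import random


def anova_f(values, labels):
    """One-way ANOVA F statistic of one SNP's genotypes across breed labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {label: sum(g) / len(g) for label, g in groups.items()}
    ss_between = sum(len(g) * (means[label] - grand_mean) ** 2
                     for label, g in groups.items())
    ss_within = sum((v - means[label]) ** 2
                    for label, g in groups.items() for v in g)
    if ss_within == 0:
        # No within-breed variation: perfectly separating or constant SNP.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def top_snps_by_f(genotypes, labels, k):
    """Indices of the k SNP columns with the largest F values."""
    n_snps = len(genotypes[0])
    scores = [anova_f([row[j] for row in genotypes], labels)
              for j in range(n_snps)]
    return sorted(range(n_snps), key=lambda j: scores[j], reverse=True)[:k]


# Synthetic toy data (an assumption, not the study's data):
# 3 hypothetical breeds, 20 animals each, 50 SNPs coded as 0/1/2 allele counts.
rng = random.Random(0)
labels, genotypes = [], []
for breed, freq in enumerate([0.1, 0.5, 0.9]):
    for _ in range(20):
        # Background SNPs: allele frequency 0.5 in every breed.
        row = [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(50)]
        # First 5 SNPs: allele frequency depends on the breed.
        for j in range(5):
            row[j] = int(rng.random() < freq) + int(rng.random() < freq)
        labels.append(breed)
        genotypes.append(row)

selected = top_snps_by_f(genotypes, labels, 5)
print(sorted(selected))
```

In a pipeline like the one described, the selected SNP columns would then be passed to a classifier such as an SVM and the accuracy estimated on held-out animals. Doing the selection inside each cross-validation fold avoids leaking test labels into the feature ranking.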


Similar articles

1
Classification accuracy of machine learning algorithms for Chinese local cattle breeds using genomic markers.
Yi Chuan. 2024 Jul;46(7):530-539. doi: 10.16288/j.yczz.24-059.
2
The use of a genomic relationship matrix for breed assignment of cattle breeds: comparison and combination with a machine learning method.
J Anim Sci. 2023 Jan 3;101. doi: 10.1093/jas/skad172.
3
Population structure and breed identification of Chinese indigenous sheep breeds using whole genome SNPs and InDels.
Genet Sel Evol. 2024 Sep 3;56(1):60. doi: 10.1186/s12711-024-00927-1.
4
Identification of population-informative markers from high-density genotyping data through combined feature selection and machine learning algorithms: Application to European autochthonous and cosmopolitan pig breeds.
Anim Genet. 2024 Apr;55(2):193-205. doi: 10.1111/age.13396. Epub 2024 Jan 8.
5
Evaluation of six machine learning classification algorithms in pig breed identification using SNPs array data.
Anim Genet. 2023 Apr;54(2):113-122. doi: 10.1111/age.13279. Epub 2022 Dec 2.
6
Improving Genomic Predictions in Multi-Breed Cattle Populations: A Comparative Analysis of BayesR and GBLUP Models.
Genes (Basel). 2024 Feb 18;15(2):253. doi: 10.3390/genes15020253.
7
Improving accuracy of genomic predictions within and between dairy cattle breeds with imputed high-density single nucleotide polymorphism panels.
J Dairy Sci. 2012 Jul;95(7):4114-29. doi: 10.3168/jds.2011-5019.
8
A machine learning approach for the identification of population-informative markers from high-throughput genotyping data: application to several pig breeds.
Animal. 2020 Feb;14(2):223-232. doi: 10.1017/S1751731119002167. Epub 2019 Oct 11.
9
Preselection statistics and Random Forest classification identify population informative single nucleotide polymorphisms in cosmopolitan and autochthonous cattle breeds.
Animal. 2018 Jan;12(1):12-19. doi: 10.1017/S1751731117001355. Epub 2017 Jun 23.
10
Using machine learning to realize genetic site screening and genomic prediction of productive traits in pigs.
FASEB J. 2023 Jun;37(6):e22961. doi: 10.1096/fj.202300245R.

Cited by

1
Identification of Taihang-chicken-specific genetic markers using genome-wide SNPs and machine learning: BREED-SPECIFIC SNPS OF TAIHANG CHICKEN.
Poult Sci. 2025 Jan;104(1):104585. doi: 10.1016/j.psj.2024.104585. Epub 2024 Nov 22.