
Identification of Target Chicken Populations by Machine Learning Models Using the Minimum Number of SNPs.

Authors

Seo Dongwon, Cho Sunghyun, Manjula Prabuddha, Choi Nuri, Kim Young-Kuk, Koh Yeong Jun, Lee Seung Hwan, Kim Hyung-Yong, Lee Jun Heon

Affiliations

Division of Animal and Dairy Science, Chungnam National University, Daejeon 34134, Korea.

Bio-AI Convergence Research Center, Chungnam National University, Daejeon 34134, Korea.

Publication information

Animals (Basel). 2021 Jan 19;11(1):241. doi: 10.3390/ani11010241.

DOI: 10.3390/ani11010241
PMID: 33477975
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7835996/

Abstract

A marker combination capable of classifying a specific chicken population could improve commercial value by increasing consumer confidence in the origin of the population. This would help protect native genetic resources in each country's market. In this study, a total of 283 samples from 20 lines, comprising Korean native chickens, commercial native chickens, and commercial broilers with a layer population, were analyzed using a 600 k high-density single nucleotide polymorphism (SNP) array to determine the optimal marker combination comprising the minimum number of markers. Machine learning algorithms, a genome-wide association study (GWAS), linkage disequilibrium (LD) analysis, and principal component analysis (PCA) were used to distinguish a target (case) group from control chicken groups. In the marker selection process, a total of 47,303 SNPs were used for classifying chicken populations; 96 LD-pruned SNPs (50 SNPs per LD block) served as the best marker combination for target chicken classification. Moreover, 36, 44, and 8 SNPs were selected as the minimum numbers of markers by the AdaBoost (AB), Random Forest (RF), and Decision Tree (DT) machine learning classification models, which had accuracy rates of 99.6%, 98.0%, and 97.9%, respectively. The selected marker combinations increased the genetic distance and fixation index (Fst) values between the case and control groups and reduced the number of genetic components required, confirming that the groups could be classified efficiently with a small marker set. In a verification study including additional chicken breeds and samples (12 lines and 182 samples), the accuracy did not change significantly, and the target chicken group could be clearly distinguished from the other populations. The GWAS, PCA, and machine learning algorithms used in this study can be applied efficiently to select, from a large number of SNP markers, the optimal combination with the minimum number of markers that distinguishes the target population.
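
The abstract reports that the selected marker sets raised the fixation index (Fst) between the case and control groups. The following is a minimal sketch of how a per-SNP Fst could be computed and used to rank candidate markers. It assumes genotypes coded as 0, 1, or 2 copies of one allele and a sample-size-weighted Wright-style estimator, (H_T - H_S) / H_T. The function names and toy genotypes are hypothetical, and the estimator and software actually used in the study are not stated in the abstract.

```python
from typing import List, Sequence


def allele_freq(genotypes: Sequence[int]) -> float:
    """Frequency of the counted allele; each genotype is 0, 1, or 2 copies."""
    return sum(genotypes) / (2 * len(genotypes))


def fst(case: Sequence[int], control: Sequence[int]) -> float:
    """Wright-style Fst for one biallelic SNP: (H_T - H_S) / H_T."""
    p1, p2 = allele_freq(case), allele_freq(control)
    n1, n2 = len(case), len(control)
    # Pooled allele frequency, weighted by group size.
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # monomorphic SNP carries no differentiation signal
    # Mean within-group expected heterozygosity, weighted by group size.
    h_s = (n1 * 2 * p1 * (1 - p1) + n2 * 2 * p2 * (1 - p2)) / (n1 + n2)
    return (h_t - h_s) / h_t


def rank_snps(case_rows: List[List[int]], control_rows: List[List[int]]) -> List[int]:
    """Return SNP column indices sorted from highest to lowest Fst."""
    n_snps = len(case_rows[0])
    scores = [
        fst([row[j] for row in case_rows], [row[j] for row in control_rows])
        for j in range(n_snps)
    ]
    return sorted(range(n_snps), key=lambda j: scores[j], reverse=True)


# Toy data (hypothetical): rows are individuals, columns are SNPs.
case_rows = [[2, 1, 2], [2, 1, 2], [2, 1, 1], [2, 1, 1]]
control_rows = [[0, 1, 0], [0, 1, 0], [0, 1, 1], [0, 1, 1]]

ranking = rank_snps(case_rows, control_rows)
print(ranking)
```

In this toy example the first SNP is fixed for opposite alleles in the two groups, the second has identical frequencies, and the third is intermediate, so a ranking of this kind would put the first SNP at the top of a candidate list before any classifier is trained.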

