Tracing cattle breeds with principal components analysis ancestry informative SNPs.

Affiliation

Department of Computer Science, Rensselaer Polytechnic Institute, Troy, New York, United States of America.

Publication information

PLoS One. 2011 Apr 7;6(4):e18007. doi: 10.1371/journal.pone.0018007.

DOI:10.1371/journal.pone.0018007

PMID:21490966

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3072384/

Abstract

The recent release of the Bovine HapMap dataset represents the most detailed survey of bovine genetic diversity to date, providing an important resource for the design and development of livestock production. We studied this dataset, comprising more than 30,000 Single Nucleotide Polymorphisms (SNPs) for 19 breeds (13 taurine, three zebu, and three hybrid breeds), seeking to identify small panels of genetic markers that can be used to trace the breed of unknown cattle samples. Taking advantage of the power of Principal Components Analysis and algorithms that we have recently described for the selection of Ancestry Informative Markers from genomewide datasets, we present a decision-tree which can be used to accurately infer the origin of individual cattle. In doing so, we present a thorough examination of population genetic structure in modern bovine breeds. Performing extensive cross-validation experiments, we demonstrate that 250-500 carefully selected SNPs suffice in order to achieve close to 100% prediction accuracy of individual ancestry, when this particular set of 19 breeds is considered. Our methods, coupled with the dense genotypic data that is becoming increasingly available, have the potential to become a valuable tool and have considerable impact in worldwide livestock production. They can be used to inform the design of studies of the genetic basis of economically important traits in cattle, as well as breeding programs and efforts to conserve biodiversity. Furthermore, the SNPs that we have identified can provide a reliable solution for the traceability of breed-specific branded products.
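The abstract does not spell out how the ancestry informative SNPs are chosen, so the following is only a minimal sketch of one PCA-based selection idea, under stated assumptions: each SNP is scored by its squared loadings on the top principal components (a leverage-style score) and the highest-scoring SNPs form the panel. The data are simulated, with two toy "breeds" rather than the 19 Bovine HapMap breeds, and the function names `simulate_two_breeds` and `select_pca_informative_snps` are hypothetical, introduced here for illustration.

```python
import numpy as np


def simulate_two_breeds(seed=0, n_per_breed=50, n_snps=200, n_informative=20):
    """Toy genotype matrix (individuals x SNPs), coded as 0/1/2 allele counts.

    Assumption for illustration only: the first `n_informative` SNPs differ
    strongly in allele frequency between two simulated breeds, while the
    remaining SNPs share a frequency of 0.5 in both.
    """
    rng = np.random.default_rng(seed)
    freq_a = np.full(n_snps, 0.5)
    freq_b = np.full(n_snps, 0.5)
    freq_a[:n_informative] = 0.9
    freq_b[:n_informative] = 0.1
    breed_a = rng.binomial(2, freq_a, size=(n_per_breed, n_snps))
    breed_b = rng.binomial(2, freq_b, size=(n_per_breed, n_snps))
    return np.vstack([breed_a, breed_b]).astype(float)


def select_pca_informative_snps(genotypes, n_components, panel_size):
    """Rank SNPs by squared loadings on the top principal components.

    Returns the column indices of the `panel_size` highest-scoring SNPs.
    """
    # Center each SNP column so the SVD corresponds to PCA.
    centered = genotypes - genotypes.mean(axis=0)
    # Rows of vt are the principal axes (right singular vectors).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    # Score of SNP j: sum of its squared loadings on the top components.
    scores = (vt[:n_components] ** 2).sum(axis=0)
    return np.argsort(scores)[::-1][:panel_size]


if __name__ == "__main__":
    genotypes = simulate_two_breeds()
    panel = select_pca_informative_snps(genotypes, n_components=1, panel_size=20)
    print(sorted(panel.tolist()))
```

To estimate prediction accuracy honestly, as in the cross-validation experiments the abstract mentions, a panel like this would be selected on the training folds only and then used to classify the held-out individuals.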
