Quality control of genotypes using heritability estimates of gene content at the marker.

Author information

Forneris Natalia S, Legarra Andres, Vitezica Zulma G, Tsuruta Shogo, Aguilar Ignacio, Misztal Ignacy, Cantet Rodolfo J C

Affiliations

Departamento de Producción Animal, Facultad de Agronomía, Universidad de Buenos Aires, C1417DSE Buenos Aires, Argentina; Consejo Nacional de Investigaciones Científicas y Técnicas, Av. Rivadavia 1917, C1033AAJ Buenos Aires, Argentina.

INRA, Génétique, Physiologie et Systèmes d'Elevage (GenPhySE), F-31326 Castanet-Tolosan, France; Université de Toulouse, INP, ENSAT, Génétique, Physiologie et Systèmes d'Elevage (GenPhySE), F-31326 Castanet-Tolosan, France.

Publication information

Genetics. 2015 Mar;199(3):675-81. doi: 10.1534/genetics.114.173559. Epub 2015 Jan 6.

Abstract

Quality control filtering of single-nucleotide polymorphisms (SNPs) is a key step when analyzing genomic data. Here we present a practical method to identify low-quality SNPs, meaning markers whose genotypes are wrongly assigned for a large proportion of individuals, by estimating the heritability of gene content at each marker, where gene content is the number of copies of a particular reference allele in a genotype of an animal (0, 1, or 2). If there is no mutation at the marker, gene content has an additive heritability of 1 by construction. The method uses restricted maximum likelihood (REML) to estimate heritability of gene content at each SNP and also builds a likelihood-ratio test statistic to test for zero error variance in genotyping. As a by-product, estimates of the allele frequencies of markers at the base population are obtained. Using simulated data with 10% permutation error (4% actual error) in genotyping, the method had a specificity of 0.96 (4% of correct markers are rejected) and a sensitivity of 0.99 (1% of wrong markers are accepted) if markers with heritability lower than 0.975 are discarded. Checking of Mendelian errors resulted in a lower sensitivity (0.84) for the same simulation. The proposed method is further illustrated with a real data set with genotypes from 3534 animals genotyped for 50,433 markers from the Illumina PorcineSNP60 chip and a pedigree of 6473 individuals; those markers underwent very little quality control. A total of 4099 markers with P-values lower than 0.01 were discarded based on our method, with associated estimates of heritability as low as 0.12. Contrary to other techniques, our method uses all information in the population simultaneously, can be used in any population with markers and pedigree recordings, and is simple to implement using standard software for REML estimation. Scripts for its use are provided.
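
The claim that gene content has a heritability of 1 when genotypes are correct, and a lower apparent heritability when they are not, can be illustrated with a small simulation. The sketch below is not the REML procedure described in the abstract: as a simpler stand-in, it regresses offspring gene content on the midparent value, whose slope estimates heritability under random mating. The function names, the allele frequency of 0.3, the 30% permutation rate, and the choice to permute only offspring genotypes are all assumptions made for illustration.

```python
import random

def simulate_trios(n_trios, allele_freq, rng):
    """Simulate (sire, dam, offspring) gene contents at one biallelic marker."""
    def random_genotype():
        # Two independent allele draws; 1 = reference allele, 0 = other allele.
        return (int(rng.random() < allele_freq), int(rng.random() < allele_freq))

    trios = []
    for _ in range(n_trios):
        sire, dam = random_genotype(), random_genotype()
        # Mendelian transmission: one randomly chosen allele from each parent.
        offspring = rng.choice(sire) + rng.choice(dam)
        trios.append((sum(sire), sum(dam), offspring))
    return trios

def permute_offspring(trios, fraction, rng):
    """Shuffle offspring gene contents among a random fraction of the trios."""
    chosen = rng.sample(range(len(trios)), int(fraction * len(trios)))
    shuffled = [trios[i][2] for i in chosen]
    rng.shuffle(shuffled)
    noisy = list(trios)
    for i, value in zip(chosen, shuffled):
        noisy[i] = (trios[i][0], trios[i][1], value)
    return noisy

def midparent_slope(trios):
    """Least-squares slope of offspring gene content on the midparent value."""
    x = [(sire + dam) / 2 for sire, dam, _ in trios]
    y = [offspring for _, _, offspring in trios]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var

rng = random.Random(2015)  # fixed seed so the run is repeatable
clean = simulate_trios(20000, 0.3, rng)
noisy = permute_offspring(clean, 0.30, rng)

clean_slope = midparent_slope(clean)
noisy_slope = midparent_slope(noisy)
# A permuted genotype can coincide with the true one, so the actual error
# rate is lower than the permutation rate.
actual_error_rate = sum(a[2] != b[2] for a, b in zip(clean, noisy)) / len(clean)

print(f"slope with correct genotypes:  {clean_slope:.3f}")
print(f"slope with permuted genotypes: {noisy_slope:.3f}")
print(f"actual genotype error rate:    {actual_error_rate:.3f}")
```

With correct genotypes the slope should sit near 1, and it should drop once genotypes are permuted. The last printed value shows why the abstract distinguishes permutation error from actual error: some permuted genotypes happen to match the true ones.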
