Evaluation of Whole-Genome Sequence Imputation Strategies in Korean Hanwoo Cattle.

Authors

Nawaz Muhammad Yasir, Bernardes Priscila Arrigucci, Savegnago Rodrigo Pelicioni, Lim Dajeong, Lee Seung Hwan, Gondro Cedric

Affiliations

Genetics and Genome Sciences Graduate Program, Michigan State University, East Lansing, MI 48824, USA.

Department of Animal Science and Rural Development, Federal University of Santa Catarina, Florianopolis 88034-000, SC, Brazil.

Publication information

Animals (Basel). 2022 Sep 1;12(17):2265. doi: 10.3390/ani12172265.

Abstract

This study evaluated the accuracy of sequence imputation in Hanwoo beef cattle using different reference panels: a large multi-breed reference with no Hanwoo (n = 6269), a much smaller Hanwoo purebred reference (n = 88), and both datasets combined (n = 6357). The target animals were 136 cattle that were both sequenced and genotyped with the Illumina BovineSNP50 v2 (50K). The average imputation accuracy measured by the Pearson correlation (R) was 0.695 with the multi-breed reference, 0.876 with the purebred Hanwoo reference, and 0.887 with the combined data; the average concordance rates (CR) were 88.16%, 94.49%, and 94.84%, respectively. The accuracy gains from adding a large multi-breed reference of 6269 samples to only 88 Hanwoo were marginal; however, the concordance rate for the heterozygotes decreased from 85% to 82%, and the concordance rate for fixed SNPs in Hanwoo also decreased from 99.98% to 98.73%. Although the multi-breed panel was large, it was not sufficiently representative of the breed for accurate imputation without the Hanwoo animals. Additionally, we evaluated the value of high-density 700K genotypes (n = 991) as an intermediary step in the imputation process. The imputation accuracy differences were negligible between a single-step imputation strategy from 50K directly to sequence and a two-step imputation approach (50K-700K-sequence). We also observed that imputed sequence data can be used as a reference panel for imputation (mean R = 0.9650, mean CR = 98.35%). Finally, we identified 31 poorly imputed genomic regions in the Hanwoo genome and demonstrated that imputation accuracies were particularly low at the chromosomal ends.
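The two accuracy measures named in the abstract, the Pearson correlation (R) and the concordance rate (CR), can be illustrated with a minimal sketch. It assumes genotypes are coded as allele dosages 0, 1, and 2 and that both measures are computed for one SNP across animals; the abstract does not state the coding or the level of aggregation, so both are assumptions, and the function names and toy genotypes are hypothetical rather than taken from the study.

```python
from math import sqrt, isnan


def pearson_r(true_gt, imputed_gt):
    """Pearson correlation between true and imputed dosages (0/1/2)."""
    n = len(true_gt)
    mean_t = sum(true_gt) / n
    mean_i = sum(imputed_gt) / n
    cov = sum((t - mean_t) * (i - mean_i) for t, i in zip(true_gt, imputed_gt))
    var_t = sum((t - mean_t) ** 2 for t in true_gt)
    var_i = sum((i - mean_i) ** 2 for i in imputed_gt)
    if var_t == 0 or var_i == 0:
        # The correlation is undefined when either vector has no variance,
        # which is the case for a fixed (monomorphic) SNP.
        return float("nan")
    return cov / sqrt(var_t * var_i)


def concordance_rate(true_gt, imputed_gt, genotype=None):
    """Percentage of genotypes imputed exactly right.

    If `genotype` is given (for example 1 for heterozygotes), only animals
    whose true genotype equals that value are counted.
    """
    pairs = [
        (t, i)
        for t, i in zip(true_gt, imputed_gt)
        if genotype is None or t == genotype
    ]
    if not pairs:
        return float("nan")
    return 100.0 * sum(t == i for t, i in pairs) / len(pairs)


# Toy data for one SNP in eight animals (hypothetical values).
true_gt = [0, 1, 2, 1, 0, 2, 1, 0]
imputed_gt = [0, 1, 2, 0, 0, 2, 1, 1]

print(round(pearson_r(true_gt, imputed_gt), 4))
print(concordance_rate(true_gt, imputed_gt))
print(round(concordance_rate(true_gt, imputed_gt, genotype=1), 2))
```

Under this coding the correlation is undefined for a SNP with no variation, which may be one reason to report a concordance rate for fixed SNPs; restricting the concordance rate to true heterozygotes gives a genotype-specific figure of the kind quoted above.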

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0adf/9454883/a0c61ce1d01f/animals-12-02265-g001.jpg