Variant calling and genotyping accuracy of ddRAD-seq: Comparison with 20X WGS in layers.

Affiliations

PEGASE, INRAE, Institut Agro, Saint Gilles, France.

MGX-Montpellier GenomiX, Univ. Montpellier, CNRS, INSERM, Montpellier, France.

Publication information

PLoS One. 2024 Jul 26;19(7):e0298565. doi: 10.1371/journal.pone.0298565. eCollection 2024.

Abstract

Whole Genome Sequencing (WGS) remains a costly or unsuitable method for routine genotyping of laying hens. Until now, breeding companies have been using or developing SNP chips. Nevertheless, alternative methods based on sequencing have been developed. Among these, reduced representation sequencing approaches can offer sequencing quality and cost-effectiveness by reducing the genomic regions covered by sequencing. The aim of this study was to evaluate the ability of double-digest restriction-site associated DNA sequencing (ddRAD-seq) to identify and genotype SNPs in laying hens, by comparison with a presumed reliable WGS approach. First, the sensitivity and precision of variant calling and the genotyping reliability of ddRAD-seq were determined. Next, the SNP call rate (CRSNP) and mean depth of sequencing per SNP (DPSNP) were compared between the two methods. Finally, the effect of multiple combinations of thresholds for these parameters on genotyping reliability and on the number of remaining SNPs in ddRAD-seq was studied. Before filtering, ddRAD-seq identified 349,497 SNPs evenly distributed across the genome, with a CRSNP of 0.55, a DPSNP of 11X and a mean genotyping reliability rate per SNP of 80%. Considering genomic regions covered by expected enzymatic fragments (EFs), the sensitivity of ddRAD-seq was estimated at 32.4% and its precision at 96.4%. The low CRSNP and DPSNP values were explained by the detection of SNPs outside the EFs theoretically generated by the ddRAD-seq protocol. Indeed, SNPs outside the EFs had significantly lower CRSNP (0.25) and DPSNP (1X) values than SNPs within the EFs (0.7 and 17X, respectively). The study demonstrated the relationship between CRSNP, DPSNP, genotyping reliability and the number of SNPs retained, providing a decision-support tool for defining filtering thresholds. Stringent quality control of the ddRAD-seq data made it possible to retain a minimum of 40% of the SNPs with a CcR of 98%. ddRAD-seq was therefore considered a suitable method for variant calling and genotyping in layers.
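The two kinds of metrics discussed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Snp` record, the function names and the toy values are all hypothetical, and the per-SNP `concordance` field is only assumed to stand for the genotyping reliability measured against WGS. The first function scores a ddRAD-seq call set against WGS sites treated as truth (the study restricts this comparison to regions covered by expected EFs, which the sketch omits); the second shows how a pair of CRSNP and DPSNP thresholds trades the fraction of SNPs retained against their mean reliability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    """One ddRAD-seq SNP with per-SNP summary metrics (hypothetical layout)."""
    site: tuple         # (chromosome, position)
    call_rate: float    # CRSNP: fraction of individuals with a genotype call
    mean_depth: float   # DPSNP: mean sequencing depth at the SNP
    concordance: float  # assumed: fraction of genotypes agreeing with WGS


def sensitivity_precision(ddrad_sites, wgs_sites):
    """Score ddRAD-seq variant sites against WGS sites treated as truth."""
    ddrad, wgs = set(ddrad_sites), set(wgs_sites)
    true_pos = len(ddrad & wgs)
    sensitivity = true_pos / len(wgs) if wgs else 0.0
    precision = true_pos / len(ddrad) if ddrad else 0.0
    return sensitivity, precision


def filter_summary(snps, min_call_rate, min_depth):
    """Return (fraction of SNPs retained, mean concordance of retained SNPs)."""
    kept = [
        s for s in snps
        if s.call_rate >= min_call_rate and s.mean_depth >= min_depth
    ]
    if not kept:
        return 0.0, None
    retained = len(kept) / len(snps)
    mean_concordance = sum(s.concordance for s in kept) / len(kept)
    return retained, mean_concordance


# Toy data, invented purely for illustration.
wgs_sites = [("1", 100), ("1", 200), ("1", 300), ("2", 50)]
snps = [
    Snp(("1", 100), 0.90, 20.0, 0.99),
    Snp(("1", 200), 0.80, 15.0, 0.97),
    Snp(("2", 75), 0.20, 1.0, 0.50),
    Snp(("1", 300), 0.30, 2.0, 0.60),
]

sens, prec = sensitivity_precision([s.site for s in snps], wgs_sites)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")

# Sweep a small grid of thresholds, as a decision-support table would.
for min_cr in (0.0, 0.5):
    for min_dp in (0.0, 10.0):
        retained, conc = filter_summary(snps, min_cr, min_dp)
        print(min_cr, min_dp, retained, conc)
```

Sweeping a grid like this makes the trade-off explicit: stricter thresholds discard low-coverage SNPs, such as those falling outside the EFs, and raise the mean reliability of what remains.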

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0300/11280156/43c4f000f20b/pone.0298565.g001.jpg
