Optimal sequencing depth design for whole genome re-sequencing in pigs.

Affiliations

National Engineering Laboratory for Animal Breeding, Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China.

Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Technology, Shandong Agricultural University, Taian, 271001, China.

Publication information

BMC Bioinformatics. 2019 Nov 8;20(1):556. doi: 10.1186/s12859-019-3164-z.

DOI:10.1186/s12859-019-3164-z

PMID:31703550

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6839175/

Abstract

BACKGROUND

As whole-genome sequencing becomes a routine technique, it is important to identify a cost-effective sequencing depth for such studies. However, the relationship between sequencing depth and biological results, in terms of whole-genome coverage, variant discovery power and variant quality, is unclear, especially in pigs. We sequenced the genomes of three Yorkshire boars at approximately 20X depth on the Illumina HiSeq X Ten platform and downloaded whole-genome sequencing data for three Duroc and three Landrace pigs, also at approximately 20X depth per individual. We then downsampled the deep genome data by extracting paired reads from the original BAM files at eleven proportions (0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 and 0.9); together with the full data, this mimics sequence data for the same individuals at twelve sequencing depths of 1.09X, 2.18X, 3.26X, 4.35X, 6.53X, 8.70X, 10.88X, 13.05X, 15.22X, 17.40X, 19.57X and 21.75X. We used these data sets to evaluate genome coverage, the variant discovery rate and genotyping accuracy as a function of sequencing depth. In addition, SNP chip data for the Yorkshire pigs were used to validate the comparison of single-sample and multisample calling algorithms.
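The downsampling design can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not name the tool used, so the hash-based `keep_pair` function, the `seed` parameter and the proportion 1.0 (standing for the full data, added so that the list has twelve entries matching the twelve reported depths) are assumptions made here for illustration. The sketch shows two things: that each reported depth is roughly the full depth of 21.75X multiplied by the sampling proportion, and one way to keep or drop both mates of a read pair together by deciding from the shared read name.

```python
import hashlib

# Values taken from the abstract; the final proportion 1.0 is an assumption
# standing for the full (not downsampled) data.
FULL_DEPTH = 21.75
PROPORTIONS = [0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
REPORTED_DEPTHS = [1.09, 2.18, 3.26, 4.35, 6.53, 8.70,
                   10.88, 13.05, 15.22, 17.40, 19.57, 21.75]


def expected_depth(full_depth, proportion):
    """Expected mean depth after keeping a given proportion of read pairs."""
    return full_depth * proportion


def keep_pair(read_name, proportion, seed=0):
    """Hypothetical helper: decide whether to keep a read pair.

    The decision depends only on the read name, which both mates share,
    so a pair is always kept or dropped as a unit.
    """
    if proportion >= 1.0:
        return True
    digest = hashlib.md5(f"{seed}:{read_name}".encode()).digest()
    # Map the first 8 bytes of the digest to a number in [0, 1].
    u = int.from_bytes(digest[:8], "big") / 2**64
    return u < proportion


if __name__ == "__main__":
    for p, reported in zip(PROPORTIONS, REPORTED_DEPTHS):
        print(f"proportion {p:4.2f}: expected {expected_depth(FULL_DEPTH, p):6.3f}X, "
              f"reported {reported:5.2f}X")
```

Deciding from a hash of the read name rather than from a fresh random draw per record makes the subsample reproducible for a given seed and avoids orphaned mates.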

RESULTS

Our results indicated that 10X is an ideal practical depth for achieving plateau coverage and discovering accurate variants; at this depth, more than 99% of the genome was covered. The number of false-positive variants increased dramatically at depths below 4X, the depth at which 95% of the whole genome was covered. In addition, the comparison of multisample and single-sample calling showed that multisample calling was more sensitive than single-sample calling, especially at lower depths. The number of variants discovered under multisample calling was 13-fold and 2-fold higher than that under single-sample calling at 1X and 22X, respectively. A large difference was observed when the depth was less than 4.38X. However, more false-positive variants were detected under multisample calling.
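As a point of reference for the coverage plateau, the sketch below computes the fraction of bases expected to be covered by at least one read under an idealized Poisson model of read placement (the Lander-Waterman approximation). This model is an assumption introduced here for illustration and is not taken from the study. Under it, coverage is about 98% at 4X and exceeds 99.99% at 10X; the lower values reported above (95% and greater than 99%) are plausibly explained by repeats, mapping bias and other effects that the model ignores.

```python
import math


def expected_coverage(depth):
    """Fraction of bases covered at least once under a Poisson model.

    Idealized: assumes reads are placed uniformly at random on the genome.
    """
    return 1.0 - math.exp(-depth)


if __name__ == "__main__":
    # Depths taken from the downsampling series in the abstract.
    for depth in (1.09, 2.18, 4.35, 10.88, 21.75):
        print(f"{depth:5.2f}X -> {expected_coverage(depth):.4f}")
```

The curve flattens quickly because the uncovered fraction shrinks exponentially with depth, which is consistent with the observation that coverage gains beyond roughly 10X are small.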

CONCLUSIONS

Our research will inform important study design decisions regarding whole-genome sequencing depth. Our results will be helpful for choosing the appropriate depth to achieve the same power for studies performed under limited budgets.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d51/6839175/7a2139970bc8/12859_2019_3164_Fig1_HTML.jpg
