Comprehensive description of genomewide nucleotide and structural variation in short-season soya bean.

Author information

Département de Phytologie, Université Laval, Quebec City, QC, Canada.

Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC, Canada.

Publication information

Plant Biotechnol J. 2018 Mar;16(3):749-759. doi: 10.1111/pbi.12825. Epub 2017 Nov 3.

Abstract

Next-generation sequencing (NGS) and bioinformatics tools have greatly facilitated the characterization of nucleotide variation; nonetheless, an exhaustive description of both SNP haplotype diversity and of structural variation remains elusive in most species. In this study, we sequenced a representative set of 102 short-season soya beans and achieved an extensive coverage of both nucleotide diversity and structural variation (SV). We called close to 5M sequence variants (SNPs, MNPs and indels) and noticed that the number of unique haplotypes had plateaued within this set of germplasm (1.7M tag SNPs). This data set proved highly accurate (98.6%) based on a comparison of called genotypes at loci shared with a SNP array. We used this catalogue of SNPs as a reference panel to impute missing genotypes at untyped loci in data sets derived from lower density genotyping tools (150 K GBS-derived SNPs/530 samples). After imputation, 96.4% of the missing genotypes imputed in this fashion proved to be accurate. Using a combination of three bioinformatics pipelines, we uncovered ~92 K SVs (deletions, insertions, inversions, duplications, CNVs and translocations) and estimated that over 90% of these were accurate. Finally, we noticed that the duplication of certain genomic regions explained much of the residual heterozygosity at SNP loci in otherwise highly inbred soya bean accessions. This is the first time that a comprehensive description of both SNP haplotype diversity and SV has been achieved within a regionally relevant subset of a major crop.
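The accuracy figures quoted above (98.6% for called genotypes, 96.4% for imputed genotypes) are concordance rates: the share of genotypes that agree with an independent reference at loci present in both data sets. The sketch below shows one way such a rate could be computed. It is an illustration only, not the authors' pipeline; the function name `genotype_concordance`, the `(chromosome, position)` locus keys, the VCF-style genotype strings and the toy data are all assumptions made for this example.

```python
from typing import Mapping, Optional, Tuple

Locus = Tuple[str, int]   # hypothetical key: (chromosome, position)
Genotype = Optional[str]  # e.g. "0/0", "0/1", "1/1"; None means missing


def genotype_concordance(
    called: Mapping[Locus, Genotype],
    reference: Mapping[Locus, Genotype],
) -> Tuple[float, int]:
    """Return (fraction of matching genotypes, number of loci compared).

    Only loci present and non-missing in BOTH data sets are compared.
    """
    compared = 0
    matches = 0
    for locus, ref_gt in reference.items():
        call_gt = called.get(locus)
        if ref_gt is None or call_gt is None:
            continue  # locus absent or missing in one data set: skip it
        compared += 1
        if call_gt == ref_gt:
            matches += 1
    if compared == 0:
        raise ValueError("no shared, non-missing loci to compare")
    return matches / compared, compared


# Toy data for one accession (invented values, not from the study).
sequencing_calls = {
    ("Gm01", 100): "0/0",
    ("Gm01", 200): "0/1",
    ("Gm02", 50): "1/1",
    ("Gm02", 75): None,   # missing call
}
array_genotypes = {
    ("Gm01", 100): "0/0",
    ("Gm01", 200): "1/1",  # disagrees with the sequencing call
    ("Gm02", 50): "1/1",
    ("Gm02", 75): "0/0",
    ("Gm03", 10): "0/0",   # locus not covered by sequencing
}

rate, n_loci = genotype_concordance(sequencing_calls, array_genotypes)
print(f"concordance = {rate:.3f} over {n_loci} shared loci")
```

The same function could score imputation by passing the imputed genotypes as `called` and the masked true genotypes as `reference`. Real data would also need consistent allele coding and strand between the two sources, which this sketch takes for granted.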

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/36a3/11388520/ed425635b477/PBI-16-749-g004.jpg
