

A resource of single-nucleotide polymorphisms for rainbow trout generated by restriction-site associated DNA sequencing of doubled haploids.

Authors

Palti Yniv, Gao Guangtu, Miller Michael R, Vallejo Roger L, Wheeler Paul A, Quillet Edwige, Yao Jianbo, Thorgaard Gary H, Salem Mohamed, Rexroad Caird E

Affiliation

National Center for Cool and Cold Water Aquaculture, ARS-USDA, 11861 Leetown Road, Kearneysville, WV, 25430, USA.

Publication information

Mol Ecol Resour. 2014 May;14(3):588-96. doi: 10.1111/1755-0998.12204. Epub 2013 Dec 13.

Abstract

Salmonid genomes are considered to be in a pseudo-tetraploid state as a result of a genome duplication event that occurred between 25 and 100 Ma. This situation complicates single-nucleotide polymorphism (SNP) discovery in rainbow trout as many putative SNPs are actually paralogous sequence variants (PSVs) and not simple allelic variants. To differentiate PSVs from simple allelic variants, we used 19 homozygous doubled haploid (DH) lines that represent a wide geographical range of rainbow trout populations. In the first phase of the study, we analysed SbfI restriction-site associated DNA (RAD) sequence data from all the 19 lines and selected 11 lines for an extended SNP discovery. In the second phase, we conducted the extended SNP discovery using PstI RAD sequence data from the selected 11 lines. The complete data set is composed of 145,168 high-quality putative SNPs that were genotyped in at least nine of the 11 lines, of which 71,446 (49%) had minor allele frequencies (MAF) of at least 18% (i.e. at least two of the 11 lines). Approximately 14% of the RAD SNPs in this data set are from expressed or coding rainbow trout sequences. Our comparison of the current data set with previous SNP discovery data sets revealed that 99% of our SNPs are novel. In the support files for this resource, we provide annotation to the positions of the SNPs in the working draft of the rainbow trout reference genome, provide the genotypes of each sample in the discovery panel and identify SNPs that are likely to be in coding sequences.
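The two thresholds quoted in the abstract (genotyped in at least nine of the 11 lines, and a minor allele carried by at least two of the 11 lines, i.e. 2/11 ≈ 18%) can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names, the one-allele-per-line genotype encoding (possible because doubled haploids are homozygous), and the example calls are all assumptions made for illustration, and the sketch assumes biallelic SNPs.

```python
from collections import Counter

# Thresholds taken from the abstract; the constant names are hypothetical.
MIN_GENOTYPED = 9     # a SNP must be called in at least nine of the 11 lines
MIN_MINOR_LINES = 2   # minor allele present in at least two lines (2/11 is about 18%)


def minor_allele_lines(genotypes):
    """Return (number of lines genotyped, number of lines with the minor allele).

    `genotypes` holds one allele per doubled haploid line (assumed encoding);
    None marks a missing call. Assumes a biallelic SNP.
    """
    called = [g for g in genotypes if g is not None]
    counts = Counter(called)
    if len(counts) < 2:
        # Monomorphic among the called lines: no minor allele observed.
        return len(called), 0
    return len(called), min(counts.values())


def passes_filters(genotypes):
    """Apply the call-rate and minor-allele thresholds described above."""
    n_called, n_minor = minor_allele_lines(genotypes)
    return n_called >= MIN_GENOTYPED and n_minor >= MIN_MINOR_LINES


# Hypothetical calls for one SNP across 11 lines (one missing call).
example = ["A", "A", "G", "A", None, "G", "A", "A", "A", "A", "A"]
print(passes_filters(example))        # True: 10 lines called, 2 carry the minor allele
print(round(2 / 11 * 100))            # 18, the MAF threshold in percent
print(round(71446 / 145168 * 100))    # 49, the share of SNPs meeting that threshold
```

Because each doubled haploid line contributes a single allele, counting lines is equivalent to counting alleles here; with heterozygous diploid samples the frequency would instead be computed over two alleles per individual.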

