High throughput sequencing in mice: a platform comparison identifies a preponderance of cryptic SNPs.

Author information

Walter Nicole A R, Bottomly Daniel, Laderas Ted, Mooney Michael A, Darakjian Priscila, Searles Robert P, Harrington Christina A, McWeeney Shannon K, Hitzemann Robert, Buck Kari J

Affiliations

Research and Development Service, Portland VA Medical Center, Portland, OR, USA.

Publication information

BMC Genomics. 2009 Aug 17;10:379. doi: 10.1186/1471-2164-10-379.

Abstract

BACKGROUND

Allelic variation is the cornerstone of genetically determined differences in gene expression, gene product structure, physiology, and behavior. However, allelic variation, particularly cryptic (unknown or not annotated) variation, is problematic for follow-up analyses. Polymorphisms result in a high incidence of false positive and false negative results in hybridization-based analyses and hinder the identification of the true variation underlying genetically determined differences in physiology and behavior. Given the proliferation of mouse genetic models (e.g., knockout models, selectively bred lines, heterogeneous stocks derived from standard inbred strains and wild mice) and the wealth of gene expression microarray and phenotypic studies using genetic models, it is critical to understand the impact of naturally occurring polymorphisms on these data. With the advent of next-generation, high-throughput sequencing, we are now in a position to determine the extent to which polymorphisms remain cryptic in such models and to assess their impact on downstream analyses.

RESULTS

We sequenced the two most commonly used inbred mouse strains, DBA/2J and C57BL/6J, across a region of chromosome 1 (171.6 to 174.6 megabases) using two next-generation, high-throughput sequencing platforms: Applied Biosystems (SOLiD) and Illumina (Genome Analyzer). Using the same templates on both platforms, we compared realignments and single nucleotide polymorphism (SNP) detection with an 80-fold average read depth across platforms and samples. While public datasets currently annotate 4,527 SNPs between the two strains in this interval, thorough high-throughput sequencing identified a total of 11,824 SNPs in the interval, including 7,663 new SNPs. Furthermore, we confirmed 40 missense SNPs and discovered 36 new missense SNPs.
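The proportions implied by these counts can be checked with a short calculation. The sketch below uses only the figures reported in this abstract. The variable names are illustrative, and treating the 40 confirmed and 36 new missense SNPs as one combined set of 76 is an assumption made for the example, not something the abstract states.

```python
# Counts reported in the abstract for the chromosome 1 interval (171.6 to 174.6 Mb).
annotated_snps = 4_527    # SNPs annotated in public datasets
total_detected = 11_824   # SNPs identified by sequencing
new_snps = 7_663          # detected SNPs absent from public annotation

# Derived value (not stated in the abstract): detected SNPs that were already annotated.
previously_known_detected = total_detected - new_snps

# Share of detected SNPs that were cryptic, i.e. not previously annotated.
cryptic_fraction = new_snps / total_detected

# Missense SNPs. Assumption: confirmed and new missense SNPs form one combined set.
confirmed_missense = 40
new_missense = 36
new_missense_fraction = new_missense / (confirmed_missense + new_missense)

print(f"Detected SNPs already annotated: {previously_known_detected}")
print(f"Cryptic share of detected SNPs: {cryptic_fraction:.1%}")
print(f"New share of missense SNPs: {new_missense_fraction:.1%}")
```

Under these assumptions, roughly 65% of the detected SNPs were previously unannotated, which is consistent with the statement that more than half of naturally occurring SNPs remain cryptic.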

CONCLUSION

Comparisons using even two of the best-characterized mouse genetic models, DBA/2J and C57BL/6J, indicate that more than half of naturally occurring SNPs remain cryptic. The magnitude of this problem is compounded when using more divergent or poorly annotated genetic models. These findings warrant full genomic sequencing of the mouse strains used as genetic models.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b77/2743714/173ce0be937a/1471-2164-10-379-1.jpg
