Joint genotyping on the fly: identifying variation among a sequenced panel of inbred lines.

Affiliation

Department of Genetics and Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27603, USA.

Publication information

Genome Res. 2012 May;22(5):966-74. doi: 10.1101/gr.129122.111. Epub 2012 Feb 23.

Abstract

High-throughput sequencing is enabling remarkably deep surveys of genomic variation. It is now possible to completely sequence multiple individuals from a single species, yet the identification of variation among them remains an evolving computational challenge. This challenge is compounded for experimental organisms when strains are studied instead of individuals. In response, we present the Joint Genotyper for Inbred Lines (JGIL) as a method for obtaining genotypes and identifying variation among a large panel of inbred strains or lines. JGIL inputs the sequence reads from each line after their alignment to a common reference. Its probabilistic model includes site-specific parameters common to all lines that describe the frequency of nucleotides segregating in the population from which the inbred panel was derived. The distribution of line genotypes is conditional on these parameters and reflects the experimental design. Site-specific error probabilities, also common to all lines, parameterize the distribution of reads conditional on line genotype and realized coverage. Both sets of parameters are estimated per site from the aggregate read data, and posterior probabilities are calculated to decode the genotype of each line. We present an application of JGIL to 162 inbred Drosophila melanogaster lines from the Drosophila Genetic Reference Panel. We explore by simulation the effect of varying coverage, sequencing error, mapping error, and the number of lines. In doing so, we illustrate how JGIL is robust to moderate levels of error. Supported by these analyses, we advocate the importance of modeling the data and the experimental design when possible.
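The model structure described in the abstract can be illustrated with a deliberately simplified sketch. This is not the JGIL implementation: it assumes a biallelic site, fully homozygous lines, a single symmetric error probability, and a plain EM fit, none of which is stated in the abstract. The function name `joint_genotype_site` and the parameter names `p` (alternate-allele frequency) and `e` (error probability) are hypothetical. What it shows is the general idea of estimating site-level parameters from the aggregate reads of all lines and then decoding each line by posterior probability.

```python
import math

def joint_genotype_site(counts, n_iter=50, p=0.5, e=0.01, eps=1e-6):
    """Toy joint genotyper for one biallelic site (illustrative only).

    counts: list of (ref_reads, alt_reads), one pair per inbred line.
    Assumes every line is homozygous (ref or alt); this is a
    simplification made for the sketch, not a claim about JGIL.
    Returns (p, e, post), where post[i] is the posterior probability
    that line i carries the alternate allele.
    """
    post = []
    for _ in range(n_iter):
        # E-step: posterior of the alt genotype for each line, given
        # the current site-level parameters shared by all lines.
        post = []
        for r, a in counts:
            log_ref = math.log(1 - p) + r * math.log(1 - e) + a * math.log(e)
            log_alt = math.log(p) + a * math.log(1 - e) + r * math.log(e)
            m = max(log_ref, log_alt)  # subtract max for numerical stability
            w_ref = math.exp(log_ref - m)
            w_alt = math.exp(log_alt - m)
            post.append(w_alt / (w_ref + w_alt))

        # M-step: re-estimate the shared parameters from all lines.
        p = sum(post) / len(post)
        # Expected number of reads that disagree with the line's genotype.
        mismatches = sum(q * r + (1 - q) * a for q, (r, a) in zip(post, counts))
        total = sum(r + a for r, a in counts)
        e = mismatches / total if total > 0 else e

        # Keep parameters strictly inside (0, 1) so the logs stay finite.
        p = min(max(p, eps), 1 - eps)
        e = min(max(e, eps), 1 - eps)
    return p, e, post

# Five lines: two clearly ref, two clearly alt, one with no coverage.
p, e, post = joint_genotype_site([(10, 0), (9, 1), (0, 12), (1, 8), (0, 0)])
```

In this toy setting the uncovered line receives a posterior equal to the estimated allele frequency, which is one way to see why sharing site-level parameters across the panel helps at low coverage.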

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2691/3337441/b806e8f4551f/966fig1.jpg