Accurate detection and genotyping of SNPs utilizing population sequencing data.

Affiliation

Scripps Translational Science Institute, The Scripps Research Institute, La Jolla, CA 92037, USA.

Publication information

Genome Res. 2010 Apr;20(4):537-45. doi: 10.1101/gr.100040.109. Epub 2010 Feb 11.

Abstract

Next-generation sequencing technologies have made it possible to sequence targeted regions of the human genome in hundreds of individuals. Deep sequencing represents a powerful approach for the discovery of the complete spectrum of DNA sequence variants in functionally important genomic intervals. Current methods for single nucleotide polymorphism (SNP) detection are designed to detect SNPs from single individual sequence data sets. Here, we describe a novel method SNIP-Seq (single nucleotide polymorphism identification from population sequence data) that leverages sequence data from a population of individuals to detect SNPs and assign genotypes to individuals. To evaluate our method, we utilized sequence data from a 200-kilobase (kb) region on chromosome 9p21 of the human genome. This region was sequenced in 48 individuals (five sequenced in duplicate) using the Illumina GA platform. Using this data set, we demonstrate that our method is highly accurate for detecting variants and can filter out false SNPs that are attributable to sequencing errors. The concordance of sequencing-based genotype assignments between duplicate samples was 98.8%. The 200-kb region was independently sequenced to a high depth of coverage using two sequence pools containing the 48 individuals. Many of the novel SNPs identified by SNIP-Seq from the individual sequencing were validated by the pooled sequencing data and were subsequently confirmed by Sanger sequencing. We estimate that SNIP-Seq achieves a low false-positive rate of approximately 2%, improving upon the higher false-positive rate for existing methods that do not utilize population sequence data. Collectively, these results suggest that analysis of population sequencing data is a powerful approach for the accurate detection of SNPs and the assignment of genotypes to individual samples.
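
The abstract reports 98.8% concordance of genotype assignments between duplicate samples but does not say how that figure was computed. The following is a minimal sketch under a stated assumption: concordance is taken to be the fraction of sites called in both replicates whose unordered genotypes agree. The function name, the site-to-genotype dictionaries, and the toy data are all hypothetical and are not taken from SNIP-Seq.

```python
from typing import Mapping, Optional


def genotype_concordance(
    rep_a: Mapping[int, Optional[str]],
    rep_b: Mapping[int, Optional[str]],
) -> float:
    """Fraction of sites called in both replicates with matching genotypes.

    Assumption (not from the paper): a genotype is a two-letter string such
    as "AG", allele order is irrelevant, and None marks a missing call.
    """
    # Keep only sites that received a genotype call in both replicates.
    shared = [
        site
        for site in rep_a
        if rep_a[site] is not None and rep_b.get(site) is not None
    ]
    if not shared:
        raise ValueError("no sites were called in both replicates")

    def normalize(genotype: str) -> str:
        # Sort alleles so that "GA" and "AG" compare as equal.
        return "".join(sorted(genotype))

    matches = sum(
        1 for site in shared if normalize(rep_a[site]) == normalize(rep_b[site])
    )
    return matches / len(shared)


# Toy data: positions and genotypes are invented for illustration only.
rep1 = {101: "AA", 250: "AG", 377: "GG", 512: "CT", 640: None}
rep2 = {101: "AA", 250: "GA", 377: "AG", 512: "CT", 640: "TT"}

concordance = genotype_concordance(rep1, rep2)
print(f"{concordance:.2%}")
```

In this toy case, position 640 is excluded because one replicate has no call there, and three of the four remaining sites agree. Other definitions are possible, for example restricting the comparison to variant sites only, and would give different values.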
