Korn Joshua M, Kuruvilla Finny G, McCarroll Steven A, Wysoker Alec, Nemesh James, Cawley Simon, Hubbell Earl, Veitch Jim, Collins Patrick J, Darvishi Katayoon, Lee Charles, Nizzari Marcia M, Gabriel Stacey B, Purcell Shaun, Daly Mark J, Altshuler David
Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA.
Nat Genet. 2008 Oct;40(10):1253-60. doi: 10.1038/ng.237. Epub 2008 Sep 7.
Accurate and complete measurement of single nucleotide (SNP) and copy number (CNV) variants, both common and rare, will be required to understand the role of genetic variation in disease. We present Birdsuite, a four-stage analytical framework instantiated in software for deriving integrated and mutually consistent copy number and SNP genotypes. The method sequentially assigns copy number across regions of common copy number polymorphisms (CNPs), calls genotypes of SNPs, identifies rare CNVs via a hidden Markov model (HMM), and generates an integrated sequence and copy number genotype at every locus (for example, including genotypes such as A-null, AAB and BBB in addition to AA, AB and BB calls). Such genotypes more accurately depict the underlying sequence of each individual, reducing the rate of apparent mendelian inconsistencies. The Birdsuite software is applied here to data from the Affymetrix SNP 6.0 array. Additionally, we describe a method, implemented in PLINK, to utilize these combined SNP and CNV genotypes for association testing with a phenotype.
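To make concrete why integrated genotypes such as A-null reduce apparent mendelian inconsistencies, here is a minimal sketch in Python. It is not the Birdsuite algorithm or its data model. The function names (`parse_genotype`, `possible_gametes`, `is_mendelian_consistent`) are hypothetical, and the sketch assumes that an individual's copies at a locus are split as evenly as possible across the two chromosomes. Under that assumption, a parent with a hemizygous deletion that a SNP-only caller reports as AA can transmit the deleted chromosome, which explains a child who would otherwise look inconsistent.

```python
def parse_genotype(label):
    """Turn a label such as 'AAB' or 'A-null' into (copies of A, copies of B).

    Hypothetical helper: only uppercase 'A' and 'B' are counted, so the
    'null' part of a label contributes zero copies.
    """
    return (label.count("A"), label.count("B"))


def possible_gametes(genotype):
    """Allele content of each chromosome a parent could transmit.

    Simplifying assumption (not from the paper): the total copy number is
    split as evenly as possible across the two chromosomes, e.g. 1+1 for
    two copies, 1+0 for one copy, 2+1 for three copies.
    """
    a, b = genotype
    total = a + b
    allowed_sizes = {total // 2, total - total // 2}
    gametes = set()
    for ha in range(a + 1):
        for hb in range(b + 1):
            if ha + hb in allowed_sizes:
                gametes.add((ha, hb))
    return gametes


def is_mendelian_consistent(father, mother, child):
    """True if the child's genotype is one paternal plus one maternal gamete."""
    target = parse_genotype(child)
    for fa, fb in possible_gametes(parse_genotype(father)):
        for ma, mb in possible_gametes(parse_genotype(mother)):
            if (fa + ma, fb + mb) == target:
                return True
    return False


# SNP-only calls: the father's deletion is invisible, so the trio looks wrong.
print(is_mendelian_consistent("AA", "BB", "BB"))          # False

# Integrated calls: the father transmits his deleted chromosome.
print(is_mendelian_consistent("A-null", "BB", "B-null"))  # True
```

The same check accepts duplication genotypes, for example an AAB parent can transmit a chromosome carrying A, B, AA or AB under the even-split assumption. A real analysis would need a more careful model of how copies are arranged on each chromosome.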