Extra-binomial variation approach for analysis of pooled DNA sequencing data.

Author information

Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Addenbrooke's Hospital, Cambridge CB2 0XY, UK.

Publication information

Bioinformatics. 2012 Nov 15;28(22):2898-904. doi: 10.1093/bioinformatics/bts553. Epub 2012 Sep 12.

Abstract

MOTIVATION

The invention of next-generation sequencing technology has made it possible to study the rare variants that are more likely to pinpoint causal disease genes. To make such experiments financially viable, DNA samples from several subjects are often pooled before sequencing. This induces large between-pool variation which, together with other sources of experimental error, creates over-dispersed data. Statistical analysis of pooled sequencing data needs to appropriately model this additional variance to avoid inflating the false-positive rate.
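The over-dispersion described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' procedure: between-pool variation is mimicked by drawing each pool's true allele frequency from a beta distribution, and the function names, pool count, depth, frequency and `rho` value are all hypothetical choices made for illustration.

```python
import random


def simulate_pool_counts(n_pools, depth, freq, rho, rng):
    """Simulate variant-read counts for pools sequenced to the same depth.

    rho = 0 gives pure binomial sampling; rho > 0 lets each pool's true
    allele frequency vary (beta-distributed), a stand-in for between-pool
    variation. All parameter values here are illustrative assumptions.
    """
    counts = []
    for _ in range(n_pools):
        if rho > 0:
            # Beta parameters chosen so that the mean is freq and
            # Var(count) = depth*freq*(1-freq)*(1 + (depth-1)*rho).
            a = freq * (1 - rho) / rho
            b = (1 - freq) * (1 - rho) / rho
            p = rng.betavariate(a, b)
        else:
            p = freq
        # One Bernoulli draw per read.
        counts.append(sum(rng.random() < p for _ in range(depth)))
    return counts


def pearson_dispersion(counts, depth):
    """Pearson chi-square over degrees of freedom; about 1 if binomial."""
    n = len(counts)
    p_hat = sum(counts) / (n * depth)
    var_binom = depth * p_hat * (1 - p_hat)
    chi2 = sum((y - depth * p_hat) ** 2 / var_binom for y in counts)
    return chi2 / (n - 1)


rng = random.Random(1)
binomial_counts = simulate_pool_counts(200, 200, 0.05, 0.0, rng)
pooled_counts = simulate_pool_counts(200, 200, 0.05, 0.02, rng)
print("binomial dispersion:", round(pearson_dispersion(binomial_counts, 200), 2))
print("over-dispersed:", round(pearson_dispersion(pooled_counts, 200), 2))
```

A dispersion statistic well above 1 signals that a plain binomial model understates the variance, which is the situation in which unadjusted tests would be expected to produce too many false positives.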

RESULTS

We propose a new statistical method based on an extra-binomial model to address the over-dispersion and apply it to pooled case-control data. We demonstrate that our model provides a better fit to the data than either a standard binomial model or a traditional extra-binomial model proposed by Williams, and that it can analyse both rare and common variants with lower or more variable pool depths than the other methods.
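To make the idea of an extra-binomial adjustment concrete, the sketch below inflates the binomial variance by a factor 1 + (depth - 1) * phi, a form commonly associated with Williams-type extra-binomial models. It is not the new model proposed in this paper, whose details the abstract does not give. The function names, the pool data and the value of `phi` are hypothetical, and `phi` is treated as known rather than estimated.

```python
import math


def inflated_variance(depth, p, phi):
    """Variance of a variant-read count with extra-binomial inflation.

    phi = 0 recovers the binomial variance depth*p*(1-p).
    """
    return depth * p * (1 - p) * (1 + (depth - 1) * phi)


def case_control_z(case, control, phi=0.0):
    """z statistic for a difference in allele frequency between groups.

    case and control are lists of (variant_reads, depth), one per pool.
    phi is treated as known here; in practice it must be estimated.
    """
    all_pools = case + control
    # Common allele frequency under the null of no case-control difference.
    p_null = sum(y for y, _ in all_pools) / sum(n for _, n in all_pools)

    def freq_and_var(pools):
        total_depth = sum(n for _, n in pools)
        freq = sum(y for y, _ in pools) / total_depth
        var = sum(inflated_variance(n, p_null, phi) for _, n in pools)
        return freq, var / total_depth ** 2

    f_case, v_case = freq_and_var(case)
    f_ctrl, v_ctrl = freq_and_var(control)
    return (f_case - f_ctrl) / math.sqrt(v_case + v_ctrl)


# Hypothetical pools: (variant reads, depth); depths vary between pools.
case_pools = [(14, 180), (9, 220), (17, 150), (12, 240)]
control_pools = [(6, 200), (8, 170), (5, 230), (7, 190)]

z_binomial = case_control_z(case_pools, control_pools, phi=0.0)
z_inflated = case_control_z(case_pools, control_pools, phi=0.01)
print(round(z_binomial, 2), round(z_inflated, 2))
```

With `phi = 0` the statistic reduces to the usual two-proportion z-test; any positive `phi` widens the standard error and shrinks the statistic, which is how accounting for over-dispersion guards against an inflated false-positive rate.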

AVAILABILITY

The R package 'extraBinomial' is available from CRAN at http://cran.r-project.org/.

CONTACT

chris.wallace@cimr.cam.ac.uk.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics Online.
