Yan Aimin, Laird Nan M, Li Cheng
Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA.
BMC Proc. 2011 Nov 29;5 Suppl 9(Suppl 9):S99. doi: 10.1186/1753-6561-5-S9-S99.
Recent advances in next-generation sequencing technologies have made it possible to generate large amounts of sequence data with rare variants in a cost-effective way. Statistical methods that test variants individually are underpowered to detect rare variants, so it is desirable to perform association analysis of rare variants by combining the information from all variants. In this study, we use a Bayesian regression method to model all variants simultaneously to identify rare variants in a data set from Genetic Analysis Workshop 17. We studied the association between the quantitative risk traits Q1, Q2, and Q4 and the single-nucleotide polymorphisms and identified several positive single-nucleotide polymorphisms for traits Q1 and Q2. However, the model also generated several apparent false positives and missed many true positives, suggesting that there is room for improvement in this model.