A Markov blanket-based method for detecting causal SNPs in GWAS.

Affiliation

Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, 66045, USA.

Publication information

BMC Bioinformatics. 2010 Apr 29;11 Suppl 3(Suppl 3):S5. doi: 10.1186/1471-2105-11-S3-S5.

Abstract

BACKGROUND

Detecting epistatic interactions associated with complex and common diseases can help to improve the prevention, diagnosis and treatment of these diseases. With the development of genome-wide association studies (GWAS), designing powerful and robust computational methods for identifying epistatic interactions associated with common diseases has become a great challenge for the bioinformatics community, because the study of epistatic interactions must cope with both the large size of the genotyped data and the huge number of possible combinations of genetic factors. Most existing computational detection methods are based on the classification capacity of SNP sets; they may fail to identify SNP sets that are strongly associated with the disease, and they may introduce many false positives. In addition, most methods are not suitable for genome-wide studies because of their computational complexity.
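To give a sense of the combinatorial scale mentioned above, the following minimal sketch counts the SNP pairs and triples an exhaustive search would have to examine. The figure of 500,000 SNPs is a hypothetical, illustrative value and is not taken from the study.

```python
from math import comb

# Hypothetical number of genotyped SNPs (an assumption for illustration only).
n_snps = 500_000

# Number of distinct 2-way and 3-way SNP combinations to test exhaustively.
pairs = comb(n_snps, 2)
triples = comb(n_snps, 3)

print(f"2-way combinations: {pairs:,}")
print(f"3-way combinations: {triples:,}")
```

Under this assumption there are already more than 10^11 pairs, which is why exhaustive enumeration of higher-order interactions quickly becomes impractical.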

RESULTS

We propose a new Markov blanket-based method, DASSO-MB (Detection of ASSOciations using Markov Blanket), to detect epistatic interactions in case-control GWAS. The Markov blanket of a target variable T completely shields T from all other variables: conditioned on its Markov blanket, T is independent of every remaining variable. Thus, we can guarantee that the SNP set detected by DASSO-MB has a strong association with the disease and contains the fewest false positives. Furthermore, DASSO-MB uses a heuristic search strategy that calculates the association between variables, which avoids the time-consuming training process required by other machine-learning methods. We apply our algorithm to simulated datasets and to a real case-control dataset. We compare DASSO-MB with other commonly used methods and show that it significantly outperforms them and is capable of finding SNPs strongly associated with diseases.
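The shielding property of a Markov blanket can be illustrated with a generic grow-shrink search. The sketch below is not the DASSO-MB implementation: it swaps in a plain conditional mutual-information threshold where a real method would use a proper statistical test, and the function names, the `threshold` value, and the toy data (`snp_a` to `snp_d`) are all hypothetical.

```python
from collections import Counter
from math import log

def cond_mutual_info(x, y, z_cols):
    """Empirical conditional mutual information I(X; Y | Z) in nats."""
    n = len(x)
    z = list(zip(*z_cols)) if z_cols else [()] * n
    c_xyz = Counter(zip(x, y, z))
    c_xz, c_yz, c_z = Counter(zip(x, z)), Counter(zip(y, z)), Counter(z)
    return sum(
        (c / n) * log(c * c_z[zi] / (c_xz[(xi, zi)] * c_yz[(yi, zi)]))
        for (xi, yi, zi), c in c_xyz.items()
    )

def markov_blanket(data, target, threshold=0.02):
    """Grow-shrink search; `threshold` is an illustrative cut-off, not a p-value."""
    t = data[target]
    blanket = []
    candidates = [v for v in data if v != target]
    # Grow: repeatedly add the variable most associated with the target
    # given the current blanket, until nothing exceeds the threshold.
    while True:
        rest = [v for v in candidates if v not in blanket]
        if not rest:
            break
        z = [data[b] for b in blanket]
        best = max(rest, key=lambda v: cond_mutual_info(data[v], t, z))
        if cond_mutual_info(data[best], t, z) <= threshold:
            break
        blanket.append(best)
    # Shrink: drop any variable that the rest of the blanket shields from the target.
    for v in list(blanket):
        others = [data[b] for b in blanket if b != v]
        if cond_mutual_info(data[v], t, others) <= threshold:
            blanket.remove(v)
    return blanket

# Toy data (hypothetical): disease = snp_a OR snp_b; snp_c is a noisy proxy
# of snp_a (mimicking linkage disequilibrium); snp_d is unrelated.
rows = [(a, b, a if k < 3 else 1 - a, d, a | b)
        for a in (0, 1) for b in (0, 1) for d in (0, 1) for k in range(4)]
names = ["snp_a", "snp_b", "snp_c", "snp_d", "disease"]
data = {name: [row[i] for row in rows] for i, name in enumerate(names)}

result = markov_blanket(data, "disease")
print(sorted(result))
```

In this toy setting `snp_c` is marginally associated with the disease, but it carries no further information once `snp_a` is in the blanket, so the search leaves it out. Note that a greedy grow phase of this kind relies on at least one SNP showing some association on its own.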

CONCLUSIONS

Our study shows that DASSO-MB can identify a minimal set of causal SNPs associated with diseases, and that this set contains fewer false positives than those found by other existing methods. Given the huge size of the genomic datasets produced by GWAS, this is critical for reducing the potential cost of biological experiments and for providing an efficient guide for pathogenesis research.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30e0/2863064/ed26c1e7b573/1471-2105-11-S3-S5-1.jpg