ACID: Association Correction for Imbalanced Data in GWAS.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2018 Jan-Feb;15(1):316-322. doi: 10.1109/TCBB.2016.2608819. Epub 2016 Sep 13.

Abstract

Genome-wide association studies (GWAS) are widely regarded as a powerful tool for revealing suspicious loci in various diseases. However, real-world GWAS tasks always suffer from data imbalance: control samples are plentiful, while case samples are limited. This imbalance can seriously bias the results and thus cause true causal markers to lose significance. To tackle this problem, we propose a computational framework for association correction for imbalanced data (ACID) that could potentially improve the performance of GWAS under imbalanced conditions. ACID is inspired by imbalanced learning theory but is specifically modified to address the task of association discovery from sequential genomic data. Simulation studies demonstrate that ACID can dramatically improve the power of traditional GWAS methods on datasets with severe imbalance. We further applied ACID to two imbalanced datasets (gastric cancer and bladder cancer) to conduct genome-wide association analysis. Experimental results indicate that our method identifies suspicious loci better than the regression approach and is consistent with existing discoveries.
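
The loss of significance under imbalance can be illustrated with a small sketch. This is not the ACID method, whose details the abstract does not give; it only shows the problem ACID targets. It assumes a standard allelic chi-square test on a 2x2 allele-count table, and the function name `allelic_chi2`, the sample sizes, and the allele frequencies are all hypothetical values chosen for illustration. Expected counts are used instead of random draws so the comparison is deterministic.

```python
def allelic_chi2(n_cases, n_controls, maf_cases, maf_controls):
    """Pearson chi-square for a 2x2 allele-count table built from expected counts.

    Hypothetical helper for illustration only; not part of ACID.
    """
    # Each individual carries two alleles at the locus.
    a = 2 * n_cases * maf_cases              # minor alleles in cases
    b = 2 * n_cases * (1 - maf_cases)        # major alleles in cases
    c = 2 * n_controls * maf_controls        # minor alleles in controls
    d = 2 * n_controls * (1 - maf_controls)  # major alleles in controls
    n = a + b + c + d
    # Closed-form Pearson chi-square for a 2x2 table (1 degree of freedom).
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Same total sample size (2000) and same assumed effect at the locus;
# only the case/control ratio differs.
balanced = allelic_chi2(1000, 1000, maf_cases=0.25, maf_controls=0.20)
imbalanced = allelic_chi2(100, 1900, maf_cases=0.25, maf_controls=0.20)

print(f"balanced   (1000 cases / 1000 controls): chi2 = {balanced:.2f}")
print(f"imbalanced ( 100 cases / 1900 controls): chi2 = {imbalanced:.2f}")
```

With the effect size and total sample size held fixed, the imbalanced design yields a smaller test statistic, which is the kind of lost significance for a true causal marker that the abstract describes.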
