ACID: Association Correction for Imbalanced Data in GWAS.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2018 Jan-Feb;15(1):316-322. doi: 10.1109/TCBB.2016.2608819. Epub 2016 Sep 13.

DOI:10.1109/TCBB.2016.2608819

Abstract

Genome-wide association studies (GWAS) are widely regarded as a powerful tool for revealing suspicious loci for various diseases. However, real-world GWAS tasks commonly suffer from data imbalance: control samples are plentiful while case samples are limited. This imbalance can seriously bias the results and thus cause true causal markers to lose significance. To tackle this problem, we propose a computational framework, association correction for imbalanced data (ACID), that could potentially improve the performance of GWAS under imbalanced conditions. ACID is inspired by imbalanced learning theory but is specifically modified to address the task of association discovery from sequential genomic data. Simulation studies demonstrate that ACID can dramatically improve the power of a traditional GWAS method on datasets with severe imbalance. We further applied ACID to two imbalanced datasets (gastric cancer and bladder cancer) to conduct genome-wide association analysis. Experimental results indicate that our method identifies suspicious loci better than the regression approach and is consistent with existing discoveries.
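The loss of significance under imbalance can be illustrated with a small calculation. The sketch below is not the ACID method, which the abstract does not specify. It only compares a standard Pearson chi-square allelic test on two hypothetical designs with the same total sample size and the same allele frequencies, one balanced and one with few cases. The sample sizes, allele frequencies, and function names are assumptions chosen for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 d.f.) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom


def expected_allele_table(n_cases, n_controls, freq_cases, freq_controls):
    """Expected allele counts (risk, non-risk) in cases and controls.

    Each individual contributes two alleles. The frequencies are
    hypothetical values, not taken from the paper.
    """
    a = 2 * n_cases * freq_cases
    b = 2 * n_cases * (1 - freq_cases)
    c = 2 * n_controls * freq_controls
    d = 2 * n_controls * (1 - freq_controls)
    return a, b, c, d


# Same total of 2000 individuals and same effect (risk allele 0.30 vs 0.20).
balanced = chi_square_2x2(*expected_allele_table(1000, 1000, 0.30, 0.20))
imbalanced = chi_square_2x2(*expected_allele_table(100, 1900, 0.30, 0.20))

print(f"balanced design   chi-square: {balanced:.2f}")
print(f"imbalanced design chi-square: {imbalanced:.2f}")
```

With the effect size held fixed, the imbalanced design yields a much smaller test statistic, so a truly associated marker is less likely to pass a genome-wide significance threshold.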


Similar articles

Bosco: Boosting Corrections for Genome-Wide Association Studies With Imbalanced Samples.

IEEE Trans Nanobioscience. 2017 Jan;16(1):69-77. doi: 10.1109/TNB.2017.2660498. Epub 2017 Jan 27.

parSMURF, a high-performance computing tool for the genome-wide detection of pathogenic variants.

Gigascience. 2020 May 1;9(5). doi: 10.1093/gigascience/giaa052.

A Markov blanket-based method for detecting causal SNPs in GWAS.

BMC Bioinformatics. 2010 Apr 29;11 Suppl 3(Suppl 3):S5. doi: 10.1186/1471-2105-11-S3-S5.

Multiple testing in genome-wide association studies via hidden Markov models.

Bioinformatics. 2009 Nov 1;25(21):2802-8. doi: 10.1093/bioinformatics/btp476. Epub 2009 Aug 4.

Online sequential class-specific extreme learning machine for binary imbalanced learning.

Neural Netw. 2019 Nov;119:235-248. doi: 10.1016/j.neunet.2019.08.018. Epub 2019 Aug 23.

An algorithm for direct causal learning of influences on patient outcomes.

Artif Intell Med. 2017 Jan;75:1-15. doi: 10.1016/j.artmed.2016.10.003. Epub 2016 Nov 5.

Pipeline design to identify key features and classify the chemotherapy response on lung cancer patients using large-scale genetic data.

BMC Syst Biol. 2018 Nov 20;12(Suppl 5):97. doi: 10.1186/s12918-018-0615-5.

Statistical Learning Methods Applicable to Genome-Wide Association Studies on Unbalanced Case-Control Disease Data.

Genes (Basel). 2021 May 13;12(5):736. doi: 10.3390/genes12050736.

HapBoost: a fast approach to boosting haplotype association analyses in genome-wide association studies.

IEEE/ACM Trans Comput Biol Bioinform. 2013 Jan-Feb;10(1):207-12. doi: 10.1109/TCBB.2013.6.

Cited by

aXonica: A support package for MRI based Neuroimaging.

Biotechnol Notes. 2024 Aug 22;5:120-136. doi: 10.1016/j.biotno.2024.08.001. eCollection 2024.
