EpiMC: Detecting Epistatic Interactions Using Multiple Clusterings.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2022 Jan-Feb;19(1):243-254. doi: 10.1109/TCBB.2021.3080462. Epub 2022 Feb 3.

Abstract

Detecting single nucleotide polymorphism (SNP) interactions is crucial for identifying susceptibility genes associated with complex human diseases in genome-wide association studies. Clustering-based approaches are widely used to reduce the search space and explore potential relationships between SNPs in epistasis analysis. However, these approaches all use only a single measure to filter out nonsignificant SNP combinations, some of which may be significant from another perspective. In this paper, we propose a two-stage approach named EpiMC (Epistatic Interactions detection based on Multiple Clusterings) that employs multiple clusterings to obtain more precise candidate sets and to detect high-order interactions more comprehensively based on these sets. In the first stage, EpiMC introduces a matrix factorization based multiple clusterings algorithm to generate multiple diverse clusterings, each of which divides all SNPs into different clusters. This stage aims to reduce the chance of filtering out potential candidates overlooked by a single clustering, and it groups associated SNPs together from different clustering perspectives. In the next stage, EpiMC considers both single-locus effects and interaction effects to select high-quality disease-associated SNPs, and then uses Jaccard similarity to obtain candidate sets. Finally, EpiMC performs an exhaustive search on the resulting small candidate sets to precisely detect epistatic interactions. Extensive simulation experiments show that EpiMC performs better in detecting high-order interactions than state-of-the-art solutions. On the Wellcome Trust Case Control Consortium (WTCCC) dataset, EpiMC detects several significant epistatic interactions associated with breast cancer (BC) and age-related macular degeneration (AMD), which further corroborates the effectiveness of EpiMC.
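The second-stage idea of forming candidate sets with Jaccard similarity and then searching them exhaustively can be illustrated with a minimal Python sketch. The abstract does not specify how Jaccard similarity is applied, so this example assumes it compares clusters drawn from different clusterings and merges pairs whose similarity reaches a threshold. The function names, the `threshold` value, and the SNP identifiers are all hypothetical and are not taken from the paper. The matrix factorization and SNP-selection steps are omitted.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_sets(clusterings, threshold=0.5):
    """Assumed merging rule: for every pair of clusters taken from two
    different clusterings, keep their union when the Jaccard similarity
    is at least `threshold` (a hypothetical parameter)."""
    candidates = []
    for c1, c2 in combinations(clusterings, 2):
        for a in c1:
            for b in c2:
                if jaccard(a, b) >= threshold:
                    merged = frozenset(a) | frozenset(b)
                    if merged not in candidates:
                        candidates.append(merged)
    return candidates


def enumerate_interactions(candidate, order):
    """Exhaustively list every SNP combination of the given order."""
    return list(combinations(sorted(candidate), order))


# Two toy clusterings of five made-up SNP identifiers.
clusterings = [
    [{"rs1", "rs2", "rs3"}, {"rs4", "rs5"}],
    [{"rs1", "rs2"}, {"rs3", "rs4", "rs5"}],
]
candidates = candidate_sets(clusterings)
pairs = enumerate_interactions(candidates[0], order=2)
```

Because the exhaustive step enumerates all combinations of a given order, its cost grows combinatorially with the candidate set size, which is why the earlier stages aim to keep the candidate sets small. A real pipeline would score each enumerated combination with a statistical association test, which is not shown here.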
