A genetic ensemble approach for gene-gene interaction identification.

Author information

School of Information Technologies, University of Sydney, NSW 2006, Australia.

Publication information

BMC Bioinformatics. 2010 Oct 21;11:524. doi: 10.1186/1471-2105-11-524.

Abstract

BACKGROUND

It is now clear that gene-gene and gene-environment interactions are ubiquitous and fundamental mechanisms in the development of complex diseases. Although considerable effort has gone into developing statistical models and algorithmic strategies for identifying such interactions, identifying them accurately has proven very challenging.

METHODS

In this paper, we propose a new approach for identifying the gene-gene and gene-environment interactions underlying complex diseases. The approach is a hybrid algorithm that combines a genetic algorithm (GA) with an ensemble of classifiers; we call the combination a genetic ensemble. It converts the original problem of single nucleotide polymorphism (SNP) interaction identification into a data mining problem of combinatorial feature selection. By collecting the various subsets of SNPs and environmental factors generated in multiple GA runs, patterns of gene-gene and gene-environment interactions can be extracted using a simple combinatorial ranking method. This study also considers combining the identification results obtained from multiple algorithms. A novel formula based on the pairwise double-fault measure is designed to quantify the degree of complementarity between algorithms.
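The abstract does not give the ranking procedure or the complementarity formula, so the following Python sketch only illustrates the two general ideas under stated assumptions. `rank_combinations` is a hypothetical helper that counts how often each feature combination co-occurs across the subsets returned by independent GA runs, which is one plausible reading of "combinatorial ranking". `double_fault` computes the standard pairwise double-fault measure (the fraction of cases that two methods both get wrong), not the paper's novel formula built on it. The feature names and the toy data are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def rank_combinations(selected_subsets, size=2):
    """Rank feature combinations by how many GA runs selected them together.

    Assumption: each element of `selected_subsets` is the list of SNP or
    environmental-factor names chosen by one GA run.
    """
    counts = Counter()
    for subset in selected_subsets:
        # Sort and deduplicate so each combination has one canonical form.
        for combo in combinations(sorted(set(subset)), size):
            counts[combo] += 1
    # Highest co-occurrence count first.
    return counts.most_common()


def double_fault(correct_a, correct_b):
    """Standard pairwise double-fault measure.

    Each argument is a sequence of booleans, one per case, that is True
    when the corresponding method got that case right. The result is the
    fraction of cases that both methods got wrong; lower values suggest
    the two methods are more complementary.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both sequences must cover the same cases")
    if not correct_a:
        raise ValueError("at least one case is required")
    both_wrong = sum(1 for a, b in zip(correct_a, correct_b) if not a and not b)
    return both_wrong / len(correct_a)


# Toy data (hypothetical): feature subsets from four GA runs.
runs = [
    ["snp3", "snp7", "snp12"],
    ["snp7", "snp3"],
    ["snp3", "snp7", "env1"],
    ["snp5", "snp12"],
]
ranking = rank_combinations(runs, size=2)
print(ranking[0])

# Toy data (hypothetical): per-case success of two identification methods.
print(double_fault([True, False, False, True], [True, False, True, False]))
```

In this toy input, the pair that appears in the most runs comes first in the ranking, and the double-fault value is the share of cases in which both methods failed.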

CONCLUSIONS

Our simulation study demonstrates that the proposed genetic ensemble algorithm has identification power comparable to that of Multifactor Dimensionality Reduction (MDR) and slightly better than that of Polymorphism Interaction Analysis (PIA), the two most popular methods for gene-gene interaction identification. More importantly, the identification results generated by our genetic ensemble algorithm are highly complementary to those obtained by PIA and MDR. Experimental results from our simulation studies and a real-world data application also confirm the effectiveness of the proposed algorithm, as well as the potential benefits of combining identification results from different algorithms.
