Entropy-based information gain approaches to detect and to characterize gene-gene and gene-environment interactions/correlations of complex diseases.

Affiliation

Department of Statistics, Texas A&M University, College Station, Texas 77843, USA.

Publication information

Genet Epidemiol. 2011 Nov;35(7):706-21. doi: 10.1002/gepi.20621.

Abstract

For complex diseases, the relationship between genotypes, environmental factors, and phenotype is usually complex and nonlinear. Our understanding of the genetic architecture of diseases has increased considerably in recent years. However, both conceptually and methodologically, detecting gene-gene and gene-environment interactions remains a challenge, despite the existence of a number of efficient methods. One method that offers great promise but has not yet been widely applied to genomic data is the entropy-based approach of information theory. In this article, we first develop entropy-based test statistics to identify two-way and higher-order gene-gene and gene-environment interactions. We then apply these methods to a bladder cancer data set to test their power and to identify their strengths and weaknesses. For two-way interactions, we propose an information gain (IG) approach based on mutual information. For three-way and higher-order interactions, an interaction IG approach is used. In both cases, we develop one-dimensional test statistics to analyze sparse data. Compared to the naive chi-square test, the test statistics we develop have similar or higher power and are robust. Applying them to the bladder cancer data set allowed us to investigate the complex interactions between DNA repair gene single nucleotide polymorphisms, smoking status, and bladder cancer susceptibility. Although not yet widely applied, entropy-based approaches appear to be a useful tool for detecting gene-gene and gene-environment interactions. The test statistics we develop add to a growing body of methodologies that will gradually shed light on the complex architecture of common diseases.
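
To make the information gain idea concrete, the following is a minimal sketch in Python. It uses the textbook interaction-information definition, IG(A;B;D) = I(A,B;D) - I(A;D) - I(B;D), estimated from empirical frequencies. This is an assumption for illustration only: the abstract does not state the exact form of the article's IG measure, and the one-dimensional test statistics it develops for sparse data are not reproduced here. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Shannon entropy (in bits) of the empirical distribution of hashable samples."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) = H(X) + H(Y) - H(X,Y) from paired samples."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def information_gain(a, b, d):
    """Interaction information IG(A;B;D) = I(A,B;D) - I(A;D) - I(B;D).

    Assumed textbook definition, not necessarily the article's statistic.
    A positive value suggests that A and B jointly carry information
    about D beyond what each carries alone.
    """
    joint = list(zip(a, b))
    return (mutual_information(joint, d)
            - mutual_information(a, d)
            - mutual_information(b, d))


# Hypothetical toy data: disease status is the XOR of two binary factors,
# so neither factor is informative alone but the pair determines status.
factor_a = [0, 0, 1, 1]
factor_b = [0, 1, 0, 1]
status = [0, 1, 1, 0]

print(information_gain(factor_a, factor_b, status))  # 1.0 bit for this toy case
```

In this toy case each factor has zero mutual information with the status on its own, while the pair determines it completely, so the whole bit of information is attributed to the interaction. With real genotype data the plug-in estimates are biased in small samples, which is one reason dedicated test statistics are needed.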

Similar articles

9
Mutual information for testing gene-environment interaction.
PLoS One. 2009;4(2):e4578. doi: 10.1371/journal.pone.0004578. Epub 2009 Feb 24.

Cited by

1
High-Dimensional Gene-Environment Interaction Analysis.
Annu Rev Stat Appl. 2025 Mar;12. doi: 10.1146/annurev-statistics-112723-034315. Epub 2024 Sep 11.
6
Unified model-free interaction screening via CV-entropy filter.
Comput Stat Data Anal. 2023 Apr;180. doi: 10.1016/j.csda.2022.107684. Epub 2022 Dec 28.
