Estrada-Gil Jesús K, Fernández-López Juan C, Hernández-Lemus Enrique, Silva-Zolezzi Irma, Hidalgo-Miranda Alfredo, Jiménez-Sánchez Gerardo, Vallejo-Clemente Edgar E
Computer Science Department, Instituto Tecnológico y de Estudios Superiores de Monterrey Campus Estado de Mexico, Mexico.
Bioinformatics. 2007 Jul 1;23(13):i167-74. doi: 10.1093/bioinformatics/btm205.
The identification of risk-associated genetic variants in common diseases remains a challenge for the biomedical research community. It has been suggested that common statistical approaches that measure only main effects are often unable to detect interactions between some of these variants. Detecting and interpreting interactions is a challenging open problem from both the statistical and the computational perspectives. Methods from computing science may improve our understanding of the mechanisms of genetic disease by detecting interactions even in the presence of very low heritabilities.
We have implemented a method based on Genetic Programming that induces a Decision Tree to detect interactions among genetic variants. The method uses a cross-validation strategy to estimate classification and prediction errors and tests the consistency of the results. To improve these estimates, we propose a new consistency measure that takes interactions into account and can be used in a genetic programming environment. The method detected five different interaction models with heritabilities as low as 0.008, with prediction errors similar to the generated errors.
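The cross-validation idea described above can be illustrated with a minimal sketch. It is not the authors' Genetic Programming method or their consistency measure: as a stand-in, it exhaustively searches SNP pairs with a majority-vote lookup table, and it reports a simple count of how often the same pair is selected across folds. The simulated data (binary genotypes, an XOR interaction with no main effects, a 10% label-flip rate) and all function names are assumptions made for illustration.

```python
import itertools
import random


def simulate(n, n_snps, causal=(0, 1), noise=0.1, seed=0):
    """Simulate binary genotypes; case status is the XOR of two causal SNPs.

    Assumption: genotypes are reduced to 0/1 for brevity, and `noise` is the
    probability that a label is flipped.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        g = [rng.randint(0, 1) for _ in range(n_snps)]
        label = g[causal[0]] ^ g[causal[1]]
        if rng.random() < noise:
            label = 1 - label
        data.append((g, label))
    return data


def fit_pair(train, pair):
    """Build a majority-vote lookup table over the joint genotype of one SNP pair."""
    counts = {}
    for g, y in train:
        key = (g[pair[0]], g[pair[1]])
        counts.setdefault(key, [0, 0])[y] += 1
    return {key: int(c[1] > c[0]) for key, c in counts.items()}


def error(table, data, pair):
    """Fraction of samples misclassified by the lookup table."""
    wrong = sum(table.get((g[pair[0]], g[pair[1]]), 0) != y for g, y in data)
    return wrong / len(data)


def cross_validate(data, n_snps, k=5):
    """For each fold, pick the pair with the lowest training (classification)
    error and report its error on the held-out fold (prediction error)."""
    folds = [data[i::k] for i in range(k)]
    results = []
    for i in range(k):
        test = folds[i]
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        best = min(
            itertools.combinations(range(n_snps), 2),
            key=lambda p: error(fit_pair(train, p), train, p),
        )
        table = fit_pair(train, best)
        results.append((best, error(table, train, best), error(table, test, best)))
    return results


data = simulate(n=400, n_snps=6)
results = cross_validate(data, n_snps=6)
for pair, train_err, test_err in results:
    print(pair, round(train_err, 3), round(test_err, 3))

# Simple consistency count: number of folds that selected the most frequent pair.
selected = [r[0] for r in results]
consistency = max(selected.count(p) for p in set(selected))
print("consistency:", consistency, "of", len(results))
```

In this sketch, a prediction error close to the label-flip rate and a consistency count equal to the number of folds would suggest that the interacting pair is recovered reliably rather than by overfitting a single training split.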
Information on the generated data sets and executable code is available upon request.