Chen Shyh-Huei, Sun Jielin, Dimitrov Latchezar, Turner Aubrey R, Adams Tamara S, Meyers Deborah A, Chang Bao-Li, Zheng S Lilly, Grönberg Henrik, Xu Jianfeng, Hsu Fang-Chi
Department of Industrial Management, National Yunlin University of Science and Technology, Yunlin, Taiwan.
Genet Epidemiol. 2008 Feb;32(2):152-67. doi: 10.1002/gepi.20272.
Although genetic factors play an important role in most human diseases, individual risk is often influenced by multiple genes or by genes together with environmental factors. To understand the biological mechanisms underlying complex diseases, it is important to understand the complex relationships that control the process. In this paper, we consider the problem from the perspectives of optimization, complexity analysis, and algorithmic design, which allows us to describe a reasonable and applicable computational framework for detecting gene-gene interactions. Accordingly, support vector machine and combinatorial optimization techniques (local search and genetic algorithm) were tailored to fit within this framework. Although the proposed approach is computationally expensive, our results indicate that it is a promising tool for identifying and characterizing high-order gene-gene and gene-environment interactions. We have demonstrated several advantages of this method, including strong classification power, less concern for overfitting, and the ability to handle unbalanced data and achieve more stable models. We would like to make support vector machine and combinatorial optimization techniques more accessible to genetic epidemiologists and to promote the use and extension of these powerful approaches.
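The general idea of pairing a support vector machine with local search can be illustrated with a minimal sketch. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the one-hot encoding of joint genotypes, the swap-one-SNP neighbourhood, the fixed-step subgradient training of a linear SVM, the 3-fold cross-validation, the hyperparameters, and the synthetic data (a full factorial of four SNPs with a purely interactive effect between SNPs 0 and 2). The genetic algorithm variant is not shown.

```python
from itertools import product

def encode(genotypes, subset):
    """Index of the joint genotype cell at the chosen SNPs (3 genotypes per SNP)."""
    cell = 0
    for snp in subset:
        cell = cell * 3 + genotypes[snp]
    return cell

def train_linear_svm(cells, labels, n_cells, epochs=20, eta=0.1, lam=0.01):
    """Subgradient descent on the L2-regularised hinge loss (no bias term).

    Each sample is one-hot at index `cells[i]`, so w . x is just w[cell].
    """
    w = [0.0] * n_cells
    for _ in range(epochs):
        for c, y in zip(cells, labels):
            violated = y * w[c] < 1.0                  # hinge margin check
            w = [wi * (1.0 - eta * lam) for wi in w]   # regularisation shrink
            if violated:
                w[c] += eta * y                        # hinge subgradient step
    return w

def cv_accuracy(X, y, subset, folds=3):
    """Cross-validated accuracy of the SVM restricted to one SNP subset."""
    cells = [encode(row, subset) for row in X]
    n_cells = 3 ** len(subset)
    correct = 0
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        test = [i for i in range(len(X)) if i % folds == f]
        w = train_linear_svm([cells[i] for i in train],
                             [y[i] for i in train], n_cells)
        correct += sum((1 if w[cells[i]] > 0 else -1) == y[i] for i in test)
    return correct / len(X)

def local_search(X, y, n_snps, start):
    """Best-improvement local search: swap one SNP at a time until no gain."""
    current = tuple(sorted(start))
    best = cv_accuracy(X, y, current)
    while True:
        neighbours = {tuple(sorted((set(current) - {out}) | {new}))
                      for out in current
                      for new in range(n_snps) if new not in current}
        score, candidate = max((cv_accuracy(X, y, nb), nb) for nb in neighbours)
        if score <= best:
            return current, best
        current, best = candidate, score

# Synthetic toy data (assumption): every genotype combination of 4 SNPs,
# with case status set by an XOR of heterozygosity at SNPs 0 and 2.
X = list(product(range(3), repeat=4))
y = [1 if (g[0] == 1) != (g[2] == 1) else -1 for g in X]

result = local_search(X, y, n_snps=4, start=(0, 1))
print(result)
```

On this toy data the search, started from the pair (0, 1), moves to the interacting pair (0, 2). A single start can stall on a pair containing neither interacting SNP, because a purely interactive effect gives no marginal signal to follow; in practice one would likely use several random restarts, and the exhaustive re-scoring of every neighbour hints at why such searches are computationally expensive.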