Hu Ting, Andrew Angeline S, Karagas Margaret R, Moore Jason H
Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA.
Pac Symp Biocomput. 2013:397-408.
Rapid advances in sequencing technologies have made thousands to millions of genetic attributes available for testing associations with biological traits. Searching this high-dimensional data space poses a major computational challenge in genome-wide association studies. We introduce a network-based approach to supervise the search for three-locus models of disease susceptibility. Statistical epistasis networks (SEN) are built from strong pairwise epistatic interactions and provide a global interaction map; higher-order interactions are then sought by prioritizing genetic attributes that cluster together in the networks. Applying this approach to a population-based bladder cancer dataset, we found a high-susceptibility three-locus model of genetic variations in DNA repair and immune regulation pathways, which, pending further biological validation, holds great potential for studying the etiology of bladder cancer. We demonstrate that the SEN-supervised search finds a small subset of three-locus models with significantly high associations at a substantially reduced computational cost.
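The network-guided search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the pairwise scores, the threshold, the attribute names, and the rule that a candidate triple must form a connected subgraph (a path or a triangle) are all assumptions made for illustration, and the abstract does not specify how clustered attributes are selected.

```python
from itertools import combinations
from math import comb


def build_network(pair_scores, threshold):
    """Keep only attribute pairs whose interaction score meets the threshold.

    pair_scores is a hypothetical mapping {(attr_a, attr_b): score}.
    Returns an undirected adjacency map {attr: set(neighbors)}.
    """
    adjacency = {}
    for (a, b), score in pair_scores.items():
        if score >= threshold:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    return adjacency


def candidate_triples(adjacency):
    """List triples whose three attributes form a connected subgraph.

    Assumption: "clustered together" is approximated by connectivity.
    Every connected three-node subgraph has a node adjacent to the
    other two, so pairing up each node's neighbors finds all of them.
    """
    triples = set()
    for center, neighbors in adjacency.items():
        for u, v in combinations(sorted(neighbors), 2):
            triples.add(tuple(sorted((center, u, v))))
    return sorted(triples)


if __name__ == "__main__":
    # Made-up scores for six hypothetical attributes A..F.
    scores = {
        ("A", "B"): 0.9,
        ("B", "C"): 0.8,
        ("C", "D"): 0.7,
        ("A", "D"): 0.1,
        ("E", "F"): 0.2,
    }
    network = build_network(scores, threshold=0.5)
    candidates = candidate_triples(network)
    print("candidate triples:", candidates)
    print("exhaustive triples over 6 attributes:", comb(6, 3))
```

In this toy example only the triples that are connected in the thresholded network would be passed on for three-locus association testing, rather than every possible combination of three attributes, which is the source of the computational saving the abstract refers to.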