Liang Xueying, Gao Ying, Lam Tram K, Li Qizhai, Falk Cathy, Yang Xiaohong R, Goldstein Alisa M, Goldin Lynn R
Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, 6120 Executive Boulevard, Bethesda, Maryland 20892, USA.
BMC Proc. 2009 Dec 15;3 Suppl 7(Suppl 7):S79. doi: 10.1186/1753-6561-3-s7-s79.
Although several genes (including a strong effect in the human leukocyte antigen (HLA) region) and some environmental factors have been implicated in susceptibility to rheumatoid arthritis (RA), the etiology of the disease is not completely understood. The ability to screen the entire genome for association with complex diseases has great potential for identifying gene effects. However, the efficiency of gene detection in this setting may be improved by methods specifically designed for high-dimensional data. The aim of this study was to compare how three statistical approaches, multifactor dimensionality reduction (MDR), random forests (RF), and an omnibus approach, performed in identifying gene effects (including gene-gene interactions) associated with RA. We developed a test set of genes based on previous linkage and association findings and applied all three methods to it. In the presence of the HLA shared-epitope factor, other genes showed weaker effects. All three methods detected single-nucleotide polymorphisms (SNPs) in PTPN22 and TRAF1-C5 as being important, but we did not detect any new genes in this study. We conclude that the three high-dimensional methods are useful as an initial screen for gene associations, identifying promising genes for further modeling and additional replication studies.