Hsu Ling-I, Briggs Farren, Shao Xiaorong, Metayer Catherine, Wiemels Joseph L, Chokkalingam Anand P, Barcellos Lisa F
School of Public Health, University of California, Berkeley, Berkeley, California.
Department of Epidemiology and Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio.
Cancer Epidemiol Biomarkers Prev. 2016 May;25(5):815-22. doi: 10.1158/1055-9965.EPI-15-0528. Epub 2016 Mar 3.
The incidence of acute lymphoblastic leukemia (ALL) is nearly 20% higher among Hispanics than among non-Hispanic Whites. Previous studies have shown evidence of association between ALL risk and variation within the IKZF1, ARID5B, CEBPE, CDKN2A, GATA3, and BMI1-PIP4K2A genes. However, the variants identified so far account for less than 10% of the genetic risk of ALL.
We applied pathway-based analyses to genome-wide association study (GWAS) data from the California Childhood Leukemia Study to determine whether different biologic pathways were overrepresented in childhood ALL and major ALL subtypes. Furthermore, we applied causal inference and data reduction methods to prioritize candidate genes within each identified overrepresented pathway, while accounting for correlation among SNPs.
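The abstract does not state which enrichment statistic was used. As a minimal sketch of what "overrepresented" can mean, the following assumes a simple hypergeometric overrepresentation test on gene counts; the function name and the toy numbers are illustrative only and are not taken from the study.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for a hypergeometric draw.

    N: genes in the background set
    K: background genes annotated to the pathway
    n: genes flagged as associated (the "hit" list)
    k: flagged genes that fall in the pathway
    """
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Toy numbers (assumed): 10 background genes, 3 in the pathway,
    # 2 flagged genes, at least 1 of which is in the pathway.
    print(hypergeom_sf(1, 10, 3, 2))
```

A small tail probability means the pathway holds more flagged genes than chance alone would predict. A gene-count test like this ignores correlation among SNPs, which the study says it accounted for separately.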
Pathway analysis results indicate that different ALL subtypes may involve distinct biologic mechanisms, although focal adhesion is shared across the disease subtypes. For ALL, the top five overrepresented Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were axon guidance, protein digestion and absorption, melanogenesis, leukocyte transendothelial migration, and focal adhesion (FDR-adjusted P < 0.05). Notably, these pathways are connected to the downstream MAPK or Wnt signaling pathways, which have been linked to B-cell malignancies. Several candidate genes for ALL, such as COL6A6 and COL5A1, were identified through targeted maximum likelihood estimation.
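The FDR threshold reported above implies a multiple-testing correction across pathways, but the abstract does not name the procedure. The sketch below assumes the Benjamini-Hochberg step-up adjustment, a common choice; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values for four pathways (not study results).
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"raw={p:.3f} adjusted={q:.3f}")
```

Under this assumption, a pathway is called overrepresented when its adjusted value falls below 0.05.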
This is the first study to use pathway-based approaches to show that distinct biologic pathways are overrepresented in different ALL subtypes, and to identify potential candidate genes using causal inference methods.
The findings demonstrate that newly developed bioinformatics tools and causal inference methods can further our understanding of the pathogenesis of leukemia. Cancer Epidemiol Biomarkers Prev; 25(5); 815-22. ©2016 AACR.