Soheila Khodakarim, Hamid Alavimajd, Farid Zayeri, Mostafa Rezaee-Tavirani, Nasrin Dehghan-Nayeri, Syyed-Mohammad Tabatabaee, Vahide Tajalli
Department of Epidemiology, Faculty of Public Health, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
Asian Pac J Cancer Prev. 2013;14(3):1629-33. doi: 10.7314/apjcp.2013.14.3.1629.
Gene set analysis (GSA) combines biological and statistical knowledge to identify gene sets that are differentially expressed between two or more phenotypes.
In this paper, gene sets differentially expressed between acute lymphoblastic leukaemia (ALL) with BCR-ABL and ALL with no observed cytogenetic abnormalities were determined by GSA methods. BCR-ABL is an abnormal gene found in some people with ALL.
The results of the two GSA methods showed that the Category test identified 30 gene sets differentially expressed between the two phenotypes, while Hotelling's T2 test detected only 19. Assessment of the genes shared among the significant gene sets showed high agreement between the GSA results and the findings of biologists. In addition, the performance of these methods was compared on simulated data and on the ALL data.
On simulated data, as the correlation between gene pairs increased, the multivariate (Hotelling's T2) test showed a lower type I error rate and higher power, in contrast to the univariate (Category) test.
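
To make the comparison concrete, the following is a minimal sketch of the two kinds of test applied to a single gene set, together with a small simulation helper for correlated genes. It is not the authors' implementation. It assumes that the Category statistic is the sum of per-gene two-sample t-statistics divided by the square root of the set size, referred to a standard normal distribution, and that Hotelling's T2 uses a pooled covariance matrix with the usual F transformation (which requires more samples than genes in the set). The function names, the equicorrelated simulation design, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np
from scipy import stats


def category_statistic(x, y):
    """Univariate summary for one gene set (assumed form, see lead-in).

    x, y: arrays of shape (n1, p) and (n2, p), samples by genes.
    Returns the standardized sum of per-gene t-statistics and a
    two-sided p-value from a standard normal reference.
    """
    t, _ = stats.ttest_ind(x, y, axis=0)   # one t-statistic per gene
    z = t.sum() / np.sqrt(t.size)
    return z, 2 * stats.norm.sf(abs(z))


def hotelling_t2(x, y):
    """Two-sample Hotelling's T2 with pooled covariance.

    Assumes n1 + n2 - 2 >= p so the pooled covariance is invertible.
    Returns T2 and the p-value from the equivalent F statistic.
    """
    n1, p = x.shape
    n2 = y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    s1 = np.atleast_2d(np.cov(x, rowvar=False))
    s2 = np.atleast_2d(np.cov(y, rowvar=False))
    s_pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    t2 = (n1 * n2) / (n1 + n2) * diff @ np.linalg.solve(s_pooled, diff)
    df2 = n1 + n2 - p - 1
    f = df2 / (p * (n1 + n2 - 2)) * t2
    return t2, stats.f.sf(f, p, df2)


def rejection_rate(test, rho, shift=0.0, n=20, p=5, runs=500,
                   alpha=0.05, seed=0):
    """Fraction of simulated data sets in which `test` rejects at `alpha`.

    Genes are equicorrelated with pairwise correlation `rho` (hypothetical
    design). shift=0 estimates the type I error rate; shift>0 adds the
    same mean difference to every gene and estimates power.
    """
    rng = np.random.default_rng(seed)
    cov = (1 - rho) * np.eye(p) + rho * np.ones((p, p))
    rejections = 0
    for _ in range(runs):
        x = rng.multivariate_normal(np.zeros(p), cov, size=n)
        y = rng.multivariate_normal(np.full(p, shift), cov, size=n)
        _, pval = test(x, y)
        rejections += pval < alpha
    return rejections / runs
```

Calling, for example, `rejection_rate(hotelling_t2, rho=0.5)` and `rejection_rate(category_statistic, rho=0.5)` over a range of `rho` values gives a rough picture of how each test behaves as the correlation between genes grows. The normal reference used for the Category statistic treats genes as independent, which is one reason its behaviour can change with correlation.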