Chung Yujin, Lee Seung Yeoun, Elston Robert C, Park Taesung
Department of Statistics, Seoul National University, San 56-1 Shillim-Dong, Kwanak-Gu, Seoul 151-747, Korea.
Bioinformatics. 2007 Jan 1;23(1):71-6. doi: 10.1093/bioinformatics/btl557. Epub 2006 Nov 8.
Identifying and characterizing genes that increase susceptibility to common, complex, multifactorial diseases is a challenging task in genetic association studies. The multifactor dimensionality reduction (MDR) method was proposed and implemented by Ritchie et al. (2001) to identify combinations of multilocus genotypes and discrete environmental factors that are associated with a particular disease. However, the original MDR method classifies each combination of multilocus genotypes as high-risk or low-risk in an ad hoc manner, based on a simple comparison of the ratio of cases to controls. As a result, the MDR approach is prone to false positive and false negative errors when the case-to-control ratio in a genotype combination is close to that in the entire dataset, or when the numbers of cases and controls in the combination are both small. We therefore propose the odds ratio based multifactor dimensionality reduction (OR MDR) method, which uses the odds ratio as a new quantitative measure of disease risk.
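The limitation described above can be illustrated with a minimal sketch of a ratio-based classification rule. The function name `mdr_label`, the choice of the overall case-to-control ratio as the threshold, and all counts are assumptions made for illustration; they are not taken from the original MDR software.

```python
def mdr_label(cases, controls, total_cases, total_controls):
    """Label one multilocus genotype cell as "high" or "low" risk.

    Assumption: a cell is high risk when its case:control ratio exceeds
    the ratio in the whole dataset. The comparison is cross-multiplied
    so that a cell with zero controls does not cause a division by zero.
    """
    return "high" if cases * total_controls > controls * total_cases else "low"


# Hypothetical dataset with 200 cases and 200 controls overall.
labels = {
    "strong signal": mdr_label(40, 10, 200, 200),   # ratio 4.0
    "near threshold": mdr_label(51, 50, 200, 200),  # ratio 1.02
    "sparse cell": mdr_label(2, 1, 200, 200),       # ratio 2.0 from 3 subjects
}
print(labels)
```

Under these assumed counts all three cells receive the same "high" label, although the second barely differs from the overall ratio and the third rests on only three subjects.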
While the original MDR method provides only a binary measure of risk, the OR MDR method provides the odds ratio as a quantitative measure of risk and, with it, an ordering of the multilocus combinations from highest to lowest risk. Furthermore, the OR MDR method provides a confidence interval for the odds ratio of each multilocus combination, which is highly informative in judging the importance of that combination as a risk factor. The proposed OR MDR method is illustrated using a dataset obtained from the CDC Chronic Fatigue Syndrome Research Group.
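A minimal sketch of the idea, not the authors' R program: each multilocus cell is compared with all remaining cells in a 2x2 table, and a Wald confidence interval is computed on the log odds ratio scale. The reference group, the 0.5 continuity correction for zero counts, the function name `odds_ratio_ci`, and the counts are all assumptions for illustration; the published method may define these details differently.

```python
import math


def odds_ratio_ci(cases, controls, total_cases, total_controls, z=1.96):
    """Return (odds ratio, lower, upper) for one cell versus all other cells.

    Assumptions: Wald interval on the log scale (z=1.96 for about 95%),
    and 0.5 added to every count when any count in the 2x2 table is zero.
    """
    a, b = cases, controls
    c, d = total_cases - cases, total_controls - controls
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical cells: (cases, controls) out of 200 cases and 200 controls.
cells = {"strong signal": (40, 10), "near threshold": (51, 50), "sparse cell": (2, 1)}
results = {name: odds_ratio_ci(ca, co, 200, 200) for name, (ca, co) in cells.items()}

# Order the cells from highest to lowest estimated odds ratio.
ranking = sorted(results, key=lambda name: results[name][0], reverse=True)
for name in ranking:
    or_, lo, hi = results[name]
    print(f"{name}: OR={or_:.2f}, CI=({lo:.2f}, {hi:.2f})")
```

With these assumed counts, only the first cell has an interval that excludes 1. The near-threshold cell and the sparse cell both have intervals that include 1, which is the kind of distinction a binary high/low label cannot express.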
An implementation of the program in R is available.