Che Jingmin, Shin Miyoung
Bio-Intelligence & Data Mining Lab, School of Electronics Engineering, Kyungpook National University, 1370 Sankyuk-dong, Buk-gu, Daegu 702-701, Republic of Korea.
Biomed Res Int. 2015;2015:576349. doi: 10.1155/2015/576349. Epub 2015 Mar 22.
To understand disease pathogenesis, improve medical diagnosis, or discover effective drug targets, it is important to identify the genes most strongly involved in human disease. Many earlier approaches prioritized candidate genes using gene expression profiles or SNP genotype data alone, but they often produce many false-positive results. To address this issue, we propose a meta-analysis strategy for gene prioritization that integrates three different genetic resources: gene expression data, single nucleotide polymorphism (SNP) genotype data, and expression quantitative trait loci (eQTL) data. To combine the scores from these distinct resources, we used an improved version of the technique for order of preference by similarity to ideal solution (TOPSIS). The method was evaluated on two publicly available datasets, one for prostate cancer and one for lung cancer, to identify disease-related genes. In these evaluations, the proposed strategy outperformed conventional methods in discovering significant disease-related genes from several types of genetic resources, while making good use of potential complementarities among the available resources.
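To illustrate the score-combination step, the following is a minimal sketch of classical TOPSIS applied to per-gene scores from three sources. It is not the improved variant used in the paper, whose details the abstract does not give. The gene names, the score values, the equal weighting of sources, and the assumption that a higher score means stronger evidence are all hypothetical choices made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def topsis(scores, weights=None):
    """Classical TOPSIS closeness coefficients for a list of score rows.

    scores  : one row per gene, one column per data source
              (assumed here: higher score = stronger evidence).
    weights : optional per-source weights; equal weights are an assumption.
    Returns one closeness value per row in [0, 1]; higher is better.
    """
    n_sources = len(scores[0])
    if weights is None:
        weights = [1.0 / n_sources] * n_sources

    # Vector-normalize each column, then apply the source weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in scores)) or 1.0
             for j in range(n_sources)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_sources)]
                for row in scores]

    # Ideal and anti-ideal points (every column treated as a benefit).
    ideal = [max(row[j] for row in weighted) for j in range(n_sources)]
    anti = [min(row[j] for row in weighted) for j in range(n_sources)]

    closeness = []
    for row in weighted:
        d_pos = euclidean(row, ideal)   # distance to the ideal solution
        d_neg = euclidean(row, anti)    # distance to the anti-ideal solution
        total = d_pos + d_neg
        closeness.append(d_neg / total if total > 0 else 0.0)
    return closeness


# Hypothetical per-gene scores: [expression, SNP, eQTL].
gene_scores = {
    "GENE_A": [0.9, 0.8, 0.7],
    "GENE_B": [0.2, 0.9, 0.1],
    "GENE_C": [0.1, 0.1, 0.2],
}

names = list(gene_scores)
closeness = topsis([gene_scores[g] for g in names])

# Rank genes from closest to the ideal solution to farthest from it.
ranked = [g for _, g in sorted(zip(closeness, names), reverse=True)]
print(ranked)
```

In this sketch a gene ranks highly only when it is close to the best observed score on every source and far from the worst, which is one way that complementary evidence from several resources could temper false positives arising from any single resource.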