Yang Cheng-Hong, Hou Ming-Feng, Chuang Li-Yeh, Yang Cheng-San, Lin Yu-Da
Department of Information Management, Tainan University of Technology, and Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Taiwan.
Biomedical Engineering, Kaohsiung Medical University, Taiwan.
Brief Bioinform. 2023 Jan 19;24(1). doi: 10.1093/bib/bbac512.
In epistasis analysis, single-nucleotide polymorphism-single-nucleotide polymorphism interactions (SSIs) among genes may, alongside environmental factors, influence the risk of multifactorial diseases. When identifying SSIs between cases and controls (i.e. binary traits), the model quality score depends on the objective function (i.e. measure) used, because of potential disease model preferences and disease complexities. Our previous study proposed a multiobjective approach-based multifactor dimensionality reduction (MOMDR), with the results indicating that using two objective functions could enhance the identification of SSIs with weak marginal effects. However, SSI identification using MOMDR remains a challenge because the optimal combination of objective functions has yet to be investigated. This study extended MOMDR to a many-objective version (i.e. many-objective MDR, MaODR) by integrating various disease probability measures based on a two-way contingency table to improve the identification of SSIs between cases and controls. We introduced an objective function selection approach to determine the optimal measure combination in MaODR among 10 well-known measures. In total, 6 disease models with marginal effects and 40 disease models without marginal effects were used to evaluate the algorithms based on multifactor dimensionality reduction (MDR), MOMDR and MaODR. Our results revealed that, through the application of the objective function selection approach, the MaODR-based three-objective-function model combining correct classification rate, likelihood ratio and normalized mutual information (MaODR-CLN) achieved a detection success rate (accuracy) 6.47% higher than that of MOMDR and 17.23% higher than that of MDR. In a Wellcome Trust Case Control Consortium dataset, MaODR-CLN successfully identified significant SSIs (P < 0.001) associated with coronary artery disease.
We performed a systematic analysis to identify the optimal measure combination in MaODR among 10 objective functions. Our combination detected SSI-based binary traits with weak marginal effects and thus reduced spurious variables in the score model. MOAI is freely available at https://sites.google.com/view/maodr/home.
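The idea of scoring a candidate SNP pair with several contingency-table measures at once can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`mdr_table`, `measures`, `pareto_front`), the high-risk threshold (overall case:control ratio), the G-statistic form of the likelihood ratio, the normalization of mutual information by the entropy of case/control status, and the toy XOR data are all assumptions made for illustration; the exact definitions in the paper may differ.

```python
import math
from collections import Counter
from itertools import combinations


def mdr_table(genotypes, labels, pair):
    """MDR-style pooling: label each two-locus genotype cell high or low risk,
    then return the 2x2 table (tp, fp, fn, tn). Assumes at least one control."""
    i, j = pair
    n_case = sum(labels)
    threshold = n_case / (len(labels) - n_case)  # assumed: overall case:control ratio
    cases, ctrls = Counter(), Counter()
    for row, y in zip(genotypes, labels):
        (cases if y == 1 else ctrls)[(row[i], row[j])] += 1
    tp = fp = fn = tn = 0
    for cell in set(cases) | set(ctrls):
        c, k = cases[cell], ctrls[cell]
        if c > threshold * k:  # high-risk cell
            tp, fp = tp + c, fp + k
        else:                  # low-risk cell
            fn, tn = fn + c, tn + k
    return tp, fp, fn, tn


def measures(tp, fp, fn, tn):
    """Return (CCR, likelihood ratio, NMI) for a 2x2 table; larger is better."""
    n = tp + fp + fn + tn
    ccr = (tp + tn) / n
    table = [[tp, fp], [fn, tn]]   # rows: predicted high/low, cols: case/control
    rows = [tp + fp, fn + tn]
    cols = [tp + fn, fp + tn]
    g = 0.0                        # sum of O * ln(O / E), with 0 * ln(0) taken as 0
    for r in range(2):
        for c in range(2):
            o = table[r][c]
            if o > 0:
                g += o * math.log(o * n / (rows[r] * cols[c]))
    lr = 2.0 * g                   # G-statistic form (assumed)
    h_y = -sum((m / n) * math.log(m / n) for m in cols if m > 0)
    nmi = (g / n) / h_y if h_y > 0 else 0.0  # mutual information / H(status)
    return ccr, lr, nmi


def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores):
    """Keep the candidates that no other candidate dominates."""
    return [k for k, s in scores.items()
            if not any(dominates(t, s) for q, t in scores.items() if q != k)]


# Toy data (hypothetical): status is the XOR of SNP 0 and SNP 1; SNP 2 is noise.
genotypes = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)] * 5
labels = [a ^ b for a, b, _ in genotypes]

scores = {pair: measures(*mdr_table(genotypes, labels, pair))
          for pair in combinations(range(3), 2)}
front = pareto_front(scores)
print(front)
```

In this toy setting neither SNP 0 nor SNP 1 has a marginal effect on its own, yet the pair (0, 1) is the only candidate left on the non-dominated front. Treating the measures as separate objectives, rather than summing them into one score, avoids having to choose weights between quantities on different scales.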