School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, 710121, Shaanxi, China.
Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an, 710121, Shaanxi, China.
Interdiscip Sci. 2022 Dec;14(4):814-832. doi: 10.1007/s12539-022-00530-2. Epub 2022 Jul 5.
Linear and nonlinear interactions among multiple single-nucleotide polymorphisms (SNPs) are important for understanding the genetic basis of complex human diseases. However, the combinatorial explosion of candidate SNP sets in high-dimensional data makes detecting multiorder SNP interactions extremely challenging. Most classic approaches perform only one task per run, namely detecting SNP interactions of a single fixed order k. Because prior knowledge of a complex disease is usually unavailable, the appropriate value of k is difficult to determine in advance.
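To give a sense of scale for the combinatorial challenge described above, the following minimal sketch counts the candidate SNP combinations at each interaction order. The function name `search_space` and the panel size of 500,000 SNPs are illustrative assumptions, not values taken from the study.

```python
import math

def search_space(n_snps, max_order):
    """Number of candidate SNP combinations for each order k = 2..max_order."""
    return {k: math.comb(n_snps, k) for k in range(2, max_order + 1)}

# Hypothetical genome-wide panel size, chosen only for illustration.
for k, count in search_space(500_000, 4).items():
    print(f"order {k}: {count:.3e} combinations")
```

Because the count grows roughly by a factor of n/k with each additional order, a method that must be rerun separately for every candidate k pays this cost repeatedly.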
A novel multitasking ant colony optimization algorithm, MTACO-DMSI, is proposed to detect multiorder SNP interactions. It has two stages: searching and testing. In the searching stage, multiple SNP interaction detection tasks (from 2nd order to kth order) are executed in parallel. For each task, two subpopulations are generated, one using the Bayesian network-based K2-score and the other using the Jensen-Shannon divergence (JS-score) as its evaluation criterion, to improve global search capability and discrimination ability across various disease models. In the testing stage, the G-test is applied to further verify the candidate solutions and reduce the error rate.
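The two evaluation criteria and the G-test named above can be sketched for a single candidate SNP combination as follows. This is a minimal illustration using commonly cited formulations (log-form K2 score for a binary phenotype, JS divergence between the case and control genotype-combination distributions, and the log-likelihood-ratio G statistic). The abstract does not give the exact definitions used in MTACO-DMSI, so the formulas, function names, and toy data here are assumptions.

```python
import math

def contingency(genotypes, labels, snps):
    """Count samples per genotype combination as [controls, cases]."""
    table = {}
    for row, y in zip(genotypes, labels):
        key = tuple(row[s] for s in snps)
        table.setdefault(key, [0, 0])[y] += 1
    return table

def k2_score(table):
    """Log-form K2 score (assumed formulation); lower suggests stronger association."""
    score = 0.0
    for counts in table.values():
        r_i = sum(counts)
        # log((r_i + J - 1)!) with J = len(counts) classes; log((J-1)!) is 0 for J = 2.
        score += math.lgamma(r_i + len(counts))
        # subtract sum_j log(r_ij!)
        score -= sum(math.lgamma(c + 1) for c in counts)
    return score

def js_score(table):
    """Jensen-Shannon divergence (bits) between case and control distributions."""
    n_ctrl = sum(c[0] for c in table.values())
    n_case = sum(c[1] for c in table.values())
    js = 0.0
    for ctrl, case in table.values():
        p, q = case / n_case, ctrl / n_ctrl
        m = 0.5 * (p + q)
        if p > 0:
            js += 0.5 * p * math.log2(p / m)
        if q > 0:
            js += 0.5 * q * math.log2(q / m)
    return js

def g_statistic(table):
    """G statistic and degrees of freedom for the combination-by-class table."""
    col = [sum(c[j] for c in table.values()) for j in range(2)]
    n = sum(col)
    g = 0.0
    for counts in table.values():
        row = sum(counts)
        for j, obs in enumerate(counts):
            if obs > 0:
                # expected count is row * col[j] / n
                g += 2.0 * obs * math.log(obs * n / (row * col[j]))
    return g, (len(table) - 1) * (2 - 1)

# Toy data (hypothetical): two SNPs with a pure XOR-style interaction, 0 = control, 1 = case.
genotypes = [(0, 0), (0, 1), (1, 0), (1, 1)] * 10
labels = [0, 1, 1, 0] * 10

pair = contingency(genotypes, labels, snps=(0, 1))
single = contingency(genotypes, labels, snps=(0,))
print("pair:  ", k2_score(pair), js_score(pair), g_statistic(pair))
print("single:", k2_score(single), js_score(single), g_statistic(single))
```

In this toy case neither SNP shows any marginal signal, yet the pair scores strongly on all three measures. A p-value for the G statistic would be obtained from the chi-squared distribution with the returned degrees of freedom.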
Three multiorder simulated disease models with different interaction effects and three real datasets, covering age-related macular degeneration (AMD), rheumatoid arthritis (RA), and type 1 diabetes (T1D), were used to evaluate the performance of MTACO-DMSI. The experimental results show that MTACO-DMSI searches faster and discriminates better across diverse simulated disease models than traditional single-task algorithms. The results on the real AMD, RA, and T1D datasets indicate that MTACO-DMSI can detect multiorder SNP interactions at a genome-wide scale. Availability and implementation: https://github.com/shouhengtuo/MTACO-DMSI/.