School of Computer and Information Science, Hunan Institute of Technology, Hengyang 412002, China.
College of Information Science and Engineering, Hunan University, Changsha, Hunan 410082, China.
Comput Math Methods Med. 2022 Jun 16;2022:3837579. doi: 10.1155/2022/3837579. eCollection 2022.
A single-nucleotide polymorphism (SNP) is the replacement of a single nucleotide in a deoxyribonucleic acid (DNA) sequence and is often linked to the development of specific diseases. Although current genotyping methods can tag SNP loci within biological samples to provide genetic information about an associated disease, their prediction accuracy is limited. They are also complex to perform and may predict an excessive number of tag SNP loci, not all of which are associated with the disease. In this study, we therefore evaluated the impact of a newly optimized fuzzy clustering and binary particle swarm optimization algorithm (FCBPSO) on the accuracy and running time of informative SNP selection. Fuzzy clustering and FCBPSO were first applied to identify the equivalence relation and the candidate tag SNP set, reducing the redundancy between loci. The FCBPSO algorithm was then optimized and used to obtain the final tag SNP set. The prediction performance and running time of the resulting model were compared with those of other traditional methods, including NMC, SPSO, and MCMR. The prediction accuracy of FCBPSO was higher than that of the other algorithms, particularly as the number of tag SNPs increased, although when the number of tag SNPs was low it was slightly lower than that of MCMR. The running time of FCBPSO was always lower than that of MCMR. FCBPSO not only reduced the size and dimension of the optimization problem but also simplified the training of the prediction model, which improved prediction accuracy and reduced running time relative to the other traditional methods.
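To illustrate the binary particle swarm step in tag SNP selection, the following is a minimal sketch and not the FCBPSO implementation evaluated in the paper. It assumes the standard binary PSO formulation with a sigmoid transfer function, omits the fuzzy clustering stage entirely, and uses a hypothetical fitness (how well each SNP agrees with its best tag SNP, minus a penalty per selected tag). The function names, the parameter values, and the toy haplotype data are all assumptions made for illustration.

```python
import math
import random


def agreement(a, b):
    """Fraction of haplotypes on which two SNP columns agree, allowing an allele flip."""
    same = sum(x == y for x, y in zip(a, b)) / len(a)
    return max(same, 1.0 - same)


def fitness(mask, columns, penalty=0.05):
    """Hypothetical score: mean best-tag agreement over all SNPs minus a size penalty."""
    tags = [c for c, m in zip(columns, mask) if m]
    if not tags:
        return -1.0  # an empty tag set cannot predict anything
    covered = sum(max(agreement(c, t) for t in tags) for c in columns) / len(columns)
    return covered - penalty * len(tags)


def binary_pso(columns, n_particles=20, n_iter=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard binary PSO: each bit says whether the SNP at that index is a tag."""
    rng = random.Random(seed)
    n = len(columns)
    pos = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_particles)]
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, columns) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-4.0, min(4.0, v))  # clamp to keep the sigmoid responsive
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i], columns)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Toy data (assumed): five SNP columns over six haplotypes.
# SNPs 0-2 carry the same information (SNP 2 is the flipped copy), as do SNPs 3-4.
columns = [
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0, 1],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0, 1],
]
best_mask, best_fit = binary_pso(columns)
print("selected tag SNP indices:", [i for i, m in enumerate(best_mask) if m])
```

On this toy input, the redundant SNPs fall into two groups, so under the assumed fitness a selection with one tag per group scores highest. The penalty term stands in for the goal of avoiding an excessive number of tag SNPs; a real evaluation would replace the agreement score with the accuracy of a trained prediction model.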