Bakhtiari Shahab, Sulaimany Sadegh, Talebi Mehrdad, Kalhor Kabmiz
Department of Biological Sciences, University of Kurdistan, Sanandaj, Iran.
Department of Computer Engineering, University of Kurdistan, Sanandaj, Iran.
Cancer Inform. 2020 Jul 15;19:1176935120942216. doi: 10.1177/1176935120942216. eCollection 2020.
Genetic variations such as single nucleotide polymorphisms (SNPs) can cause susceptibility to cancer. Although thousands of genetic variants have been associated with different cancers, the molecular mechanisms of cancer remain unknown. No dedicated dataset represents the relationships between cancers and SNPs as a bipartite network suitable for computational analysis and prediction. Link prediction, a computational graph analysis method, can provide new insight into such a network. In this article, we built a cancer-SNP network from the SNPedia and Cancer Research UK databases and then evaluated computational link prediction methods for predicting new SNP-cancer relationships. The results show that, among the popular topology-based scoring methods for relation prediction, the preferential attachment (PA) algorithm is the most robust according to computational and experimental evidence, and some of its predictions are corroborated by recent publications. According to the PA predictions, rs1801394-Non-small cell lung cancer, rs4880-Non-small cell lung cancer, and rs1805794-Colorectal cancer are among the most probable SNP-cancer associations that have not yet been mentioned in any published article, which makes them strong candidates for additional laboratory and validation studies. The prediction algorithms could also be improved to produce new predictions in the future.
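The preferential attachment score mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard definition of the score for a bipartite network, the degree of the SNP multiplied by the degree of the cancer, applied to every SNP-cancer pair that is not yet linked. The function name `preferential_attachment_scores` and the toy edge list are hypothetical placeholders, not the authors' code or data.

```python
from collections import defaultdict
from itertools import product


def preferential_attachment_scores(edges):
    """Rank unlinked (snp, cancer) pairs by degree(snp) * degree(cancer).

    `edges` is an iterable of (snp, cancer) pairs describing known links.
    Degrees are kept per side so an SNP and a cancer may share a label.
    """
    known = set(edges)
    snp_degree = defaultdict(int)
    cancer_degree = defaultdict(int)
    for snp, cancer in known:
        snp_degree[snp] += 1
        cancer_degree[cancer] += 1

    # Score only the pairs that are absent from the known network.
    scores = {
        (snp, cancer): snp_degree[snp] * cancer_degree[cancer]
        for snp, cancer in product(sorted(snp_degree), sorted(cancer_degree))
        if (snp, cancer) not in known
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy network; the labels are placeholders, not real associations.
toy_edges = [
    ("snp_a", "cancer_x"),
    ("snp_a", "cancer_y"),
    ("snp_b", "cancer_x"),
    ("snp_c", "cancer_z"),
]

ranked = preferential_attachment_scores(toy_edges)
for (snp, cancer), score in ranked:
    print(f"{snp} - {cancer}: {score}")
```

In this sketch the highest-ranked pairs are simply those whose two endpoints already have the most known links, which is why the method favors well-studied SNPs and cancers.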