Ye Lili, Lin Yongwei, Fan Xing-di, Chen Yaoming, Deng Zengli, Yang Qian, Lei Xiaotian, Mao Jizong, Cui Chunhui
Daycare Chemotherapy Center, Zhujiang Hospital, Southern Medical University, Guangzhou, China.
Department of General Surgery, Zhujiang Hospital, Southern Medical University, Guangzhou, China.
Front Cell Dev Biol. 2021 Jul 26;9:722410. doi: 10.3389/fcell.2021.722410. eCollection 2021.
The number of patients with inflammatory bowel disease (IBD) is increasing worldwide. IBD is recurrent and difficult to cure, and it is also a high-risk factor for colorectal cancer (CRC). The occurrence of IBD is closely related to genetic factors, which prompted us to identify IBD-related genes. Based on the hypothesis that similar diseases are related to similar genes, we proposed a support vector machine (SVM)-based method to identify IBD-related genes from disease similarities and gene interactions. We obtained 135 diseases that are similar to IBD, together with their related genes, and treated these genes as candidate IBD-related genes. We extracted features for each gene and used the SVM to estimate the probability that the gene is related to IBD. Ten-fold cross-validation was applied to verify the effectiveness of our method. The AUC was 0.93 and the AUPR was 0.97, both the best among the four methods compared. We prioritized the candidate genes and conducted case studies on the top five genes.
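The evaluation described above (ten-fold cross-validation scored by AUC and AUPR) can be illustrated with a minimal sketch. This is not the authors' code: the function names `k_fold_indices`, `roc_auc`, and `average_precision`, the contiguous fold layout, and the toy labels and scores are all assumptions made for illustration, and AUPR is approximated here by average precision.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Approximate AUPR as the mean precision at the rank of each positive.

    Tied scores keep their input order, so results with ties depend on ordering.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Hypothetical toy data: 1 = gene related to the disease, 0 = unrelated;
# scores stand in for the probabilities a classifier would output.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

toy_auc = roc_auc(toy_labels, toy_scores)             # 8 of 9 pairs ranked correctly
toy_aupr = average_precision(toy_labels, toy_scores)  # (1 + 1 + 0.75) / 3
```

In a full pipeline, each fold's held-out genes would be scored by a classifier trained on the remaining folds, and the two metrics would be computed on the pooled or per-fold predictions.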