Institute of Computer Science, University of Bialystok, Białystok, Poland.
Poult Sci. 2020 Dec;99(12):6341-6354. doi: 10.1016/j.psj.2020.08.059. Epub 2020 Sep 12.
Both categories of immune response, innate and adaptive, have a polygenic background and a significant environmental component. The goal of the reported study was to identify candidate genes and mutations for immune traits of interest in chickens using machine learning-based sensitivity analysis of single-nucleotide polymorphisms (SNPs) located in candidate genes within quantitative trait loci regions. Adaptive immunity was represented by the specific antibody response to keyhole limpet hemocyanin (KLH), whereas innate immunity was represented by natural antibodies to lipopolysaccharide (LPS) and lipoteichoic acid (LTA). The analysis consisted of 3 basic steps: identification of candidate SNPs via feature selection, optimisation of the feature set using recursive feature elimination, and a gene-level sensitivity analysis for the final selection of models. The predictive model based on 5 genes (MAPK8IP3, CRLF3, UNC13D, ILR9, and PRCKB) explains 14.9% of the variance in the KLH adaptive response. The models obtained for LTA and LPS use more genes and have lower predictive power, explaining 7.8% and 4.5% of total variance, respectively. In comparison, linear models built on genes identified by a standard statistical analysis explain 1.5%, 0.5%, and 0.3% of variance for the KLH, LTA, and LPS responses, respectively. The present study shows that machine learning methods applied to systems with a complex interaction network can discover phenotype-genotype associations with much higher sensitivity than traditional statistical models. It adds to the evidence suggesting a role for MAPK8IP3 in the adaptive immune response and indicates that CRLF3 is also involved in this process. Both findings need additional verification.
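The second step of the analysis, recursive feature elimination, and the "variance explained" figures quoted above can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not name the learning algorithm, so an ordinary least-squares model stands in for it, the data are synthetic genotypes coded 0/1/2, and every function name is hypothetical. The variance explained is computed in-sample for brevity, whereas a real analysis would estimate it on held-out data. The gene-level sensitivity analysis of the third step is not shown.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(X, y, cols):
    """Least-squares fit of y on the chosen columns: [intercept, coef, ...]."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(cols) + 1
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(XtX, Xty)

def predict(X, cols, beta):
    """Predicted phenotype for each sample from the fitted coefficients."""
    return [beta[0] + sum(b * row[c] for b, c in zip(beta[1:], cols)) for row in X]

def r_squared(y, y_hat):
    """Fraction of phenotypic variance explained by the predictions."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((v - p) ** 2 for v, p in zip(y, y_hat))
    return 1.0 - ss_res / ss_tot

def recursive_feature_elimination(X, y, n_keep):
    """Refit repeatedly, dropping the SNP with the smallest |coefficient|."""
    cols = list(range(len(X[0])))
    while len(cols) > n_keep:
        beta = fit_ols(X, y, cols)
        weakest = min(range(len(cols)), key=lambda i: abs(beta[i + 1]))
        cols.pop(weakest)
    return cols

def make_toy_data(n_samples=200, n_snps=6, seed=0):
    """Synthetic genotypes (0/1/2); only columns 0 and 3 affect the trait."""
    rng = random.Random(seed)
    X = [[float(rng.choice((0, 1, 2))) for _ in range(n_snps)]
         for _ in range(n_samples)]
    y = [1.0 * row[0] + 0.8 * row[3] + rng.gauss(0.0, 0.5) for row in X]
    return X, y

X, y = make_toy_data()
kept = recursive_feature_elimination(X, y, n_keep=2)
beta = fit_ols(X, y, kept)
print("kept SNP columns:", kept)
print("variance explained:", round(r_squared(y, predict(X, kept, beta)), 3))
```

Ranking by absolute coefficient is reasonable here only because all synthetic SNP columns share the same 0/1/2 scale; with a non-linear learner, a model-specific importance measure would take its place.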