Che Kai, Chen Xi, Guo Maozu, Wang Chunyu, Liu Xiaoyan
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China.
School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China.
Front Genet. 2020 Mar 3;11:155. doi: 10.3389/fgene.2020.00155. eCollection 2020.
Identification of genetic variants associated with complex traits is a critical step for improving plant resistance and breeding. Although the majority of existing methods for variant detection perform well in the average case, they cannot precisely identify variants present in a small number of target genes. In this paper, we propose a weighted sparse group lasso (WSGL) method to select both common and low-frequency variants in groups. Under the biologically realistic assumption that complex traits are influenced by a few single loci in a small number of genes, our method uses a sparse group lasso approach to simultaneously select associated groups and the loci within each group. To increase the probability of selecting low-frequency variants, biological prior information is introduced into the model by re-weighting the lasso regularization with weights calculated from the input data. Experimental results on both simulated and real data of single nucleotide polymorphisms (SNPs) associated with flowering traits show that WSGL outperforms competing approaches for genetic variant detection.
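The selection mechanism described above can be illustrated with a minimal sketch. It assumes a squared-error loss, a per-SNP weight on the lasso term, and a group penalty scaled by the square root of group size; the abstract does not state the exact objective or how the weights are computed, so the function names (`prox_wsgl`, `fit_wsgl`), the parameters (`lam1`, `lam2`, `weights`) and the proximal gradient solver are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def prox_wsgl(beta, groups, weights, lam1, lam2, step):
    """Proximal step for a weighted l1 penalty plus a group l2 penalty (assumed form)."""
    # Per-locus soft-thresholding; a smaller weight means a weaker penalty on that SNP.
    z = np.sign(beta) * np.maximum(np.abs(beta) - step * lam1 * weights, 0.0)
    out = np.zeros_like(z)
    for g in np.unique(groups):
        idx = groups == g
        norm = np.linalg.norm(z[idx])
        # Group-level shrinkage; the sqrt(group size) scaling is an assumption.
        thresh = step * lam2 * np.sqrt(idx.sum())
        if norm > thresh:
            out[idx] = (1.0 - thresh / norm) * z[idx]
        # Otherwise the whole group (e.g. all SNPs in one gene) stays at zero.
    return out

def fit_wsgl(X, y, groups, weights, lam1, lam2, n_iter=500):
    """Proximal gradient descent on (1/2n)||y - X beta||^2 plus the penalties above."""
    n, p = X.shape
    # Step size 1/L, where L = sigma_max(X)^2 / n bounds the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = prox_wsgl(beta - step * grad, groups, weights, lam1, lam2, step)
    return beta
```

Here `groups` would be an integer array assigning each SNP column of `X` to a gene, and `weights` would hold the data-derived values, for instance smaller entries for low-frequency variants so that they are penalized less. That weighting choice is hypothetical and only meant to show where such prior information would enter.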