Lee Young-Sup, Won KyeongHye, Shin Donghyun, Oh Jae-Don
Department of Animal Biotechnology, Jeonbuk National University, Jeonju, Republic of Korea.
The Animal Molecular Genetics and Breeding Center, Jeonbuk National University, Jeonju, Republic of Korea.
Anim Cells Syst (Seoul). 2020 Dec 24;24(6):321-328. doi: 10.1080/19768354.2020.1860125.
Although nonsynonymous single nucleotide polymorphisms (nsSNPs) have been studied extensively, genome-wide studies based on nsSNPs are rare. NsSNPs alter amino acid sequences, affect protein structure and function, and can have deleterious effects. By predicting the deleterious effect of each nsSNP, we determined a total risk score per individual. Additionally, a machine learning technique was used to find an optimal nsSNP subset that best explains the complete nsSNP effect. A total of 16,100 nsSNPs were selected as the best representatives among 89,519 regressed nsSNPs. In the gene ontology analysis of these 16,100 nsSNPs, DNA metabolic process, chemokine- and immune-related terms, and reproduction were the most enriched. We expect that our risk score prediction and nsSNP marker selection will contribute to the future development of genome-wide association studies and to breeding science more broadly.
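The abstract does not give the scoring formula, so the following is only a minimal sketch of how a per-individual total risk score and the explanatory power of an nsSNP subset might be computed. It assumes the score is a dosage-weighted sum of per-nsSNP deleteriousness values and that a subset is judged by the squared correlation between its score and the full score. The SNP names, score values, genotypes, and the functions `total_risk_score` and `r_squared` are all hypothetical and are not taken from the study.

```python
from math import sqrt
from typing import Dict, Iterable, List, Optional

# Hypothetical deleteriousness score per nsSNP (higher = more damaging).
# These values are illustrative only.
snp_scores: Dict[str, float] = {"snp1": 0.9, "snp2": 0.2, "snp3": 0.6}

# Hypothetical genotypes, coded as the count of the alternative allele (0, 1, 2).
genotypes: Dict[str, Dict[str, int]] = {
    "animal_A": {"snp1": 0, "snp2": 2, "snp3": 1},
    "animal_B": {"snp1": 2, "snp2": 0, "snp3": 1},
    "animal_C": {"snp1": 1, "snp2": 1, "snp3": 0},
}


def total_risk_score(
    dosages: Dict[str, int],
    scores: Dict[str, float],
    subset: Optional[Iterable[str]] = None,
) -> float:
    """Assumed definition: sum of allele dosage times deleteriousness score."""
    snps = scores.keys() if subset is None else subset
    return sum(dosages.get(snp, 0) * scores[snp] for snp in snps)


def r_squared(x: List[float], y: List[float]) -> float:
    """Squared Pearson correlation between two equally long score vectors."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return (cov / sqrt(var_x * var_y)) ** 2


animals = sorted(genotypes)

# Score from all nsSNPs versus score from a candidate subset.
full = [total_risk_score(genotypes[a], snp_scores) for a in animals]
candidate = ["snp1", "snp3"]
partial = [total_risk_score(genotypes[a], snp_scores, candidate) for a in animals]

# How much of the full score the candidate subset reproduces.
explained = r_squared(full, partial)
```

A subset selection procedure would search for the smallest set of nsSNPs that keeps `explained` close to 1; the specific machine learning method used in the study is not stated in the abstract, so none is implied here.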