Genetic differences among ethnic groups.

Author information

Huang Tao, Shu Yang, Cai Yu-Dong

Affiliations

Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, P. R. China.

State Key Laboratory of Biotherapy, Sichuan University, Sichuan, 610041, P. R. China.

Publication information

BMC Genomics. 2015 Dec 21;16:1093. doi: 10.1186/s12864-015-2328-0.

Abstract

BACKGROUND

Many differences between different ethnic groups have been observed, such as skin color, eye color, height, susceptibility to some diseases, and response to certain drugs. However, the genetic bases of such differences have been under-investigated. Since the HapMap project, large-scale genotype data from Caucasian, African and Asian population samples have been available. The project found that these populations were located in different areas of the PCA (Principal Component Analysis) plot. However, as an unsupervised method, PCA does not measure the differences in each single nucleotide polymorphism (SNP) among populations.

RESULTS

We applied an advanced mutual information-based feature selection method to detect associations between SNP status and ethnic groups using the latest HapMap Phase 3 release version 3, which included more sub-populations. A total of 299 SNPs were identified, and they accurately predicted the ethnicity of all HapMap populations. The 10-fold cross-validation accuracy of the SMO (sequential minimal optimization) model on the training dataset was 0.901, and the accuracy on the independent test dataset was 0.895.
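To make the idea of scoring each SNP against the population label concrete, here is a minimal sketch of plain mutual-information ranking. It is not the authors' pipeline: the abstract does not specify the exact feature selection procedure, so this shows only the basic quantity I(G; Y) between a genotype G and a group label Y. The function names, the 0/1/2 genotype coding, the SNP names, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(genotypes, labels):
    """Mutual information I(G; Y) in bits between two discrete sequences."""
    if len(genotypes) != len(labels):
        raise ValueError("genotypes and labels must have the same length")
    n = len(labels)
    joint = Counter(zip(genotypes, labels))   # counts of (genotype, label) pairs
    g_counts = Counter(genotypes)             # marginal genotype counts
    y_counts = Counter(labels)                # marginal label counts
    mi = 0.0
    for (g, y), c in joint.items():
        # p(g, y) * log2( p(g, y) / (p(g) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (g_counts[g] * y_counts[y]))
    return mi


def rank_snps(snp_table, labels):
    """Return (snp_name, score) pairs sorted by decreasing mutual information."""
    scores = {name: mutual_information(g, labels) for name, g in snp_table.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for illustration only (not HapMap genotypes).
# Genotypes are coded as the number of minor alleles: 0, 1, or 2 (an assumption).
labels = ["A", "A", "A", "B", "B", "B"]
snp_table = {
    "snp_informative": [0, 0, 0, 2, 2, 2],    # genotype tracks the group exactly
    "snp_uninformative": [0, 1, 2, 0, 1, 2],  # genotype independent of the group
}

ranking = rank_snps(snp_table, labels)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

In this toy case the first SNP separates the two groups perfectly and the second carries no information about group membership, so the ranking puts the first on top. A real analysis would additionally need to handle redundancy between correlated SNPs and validate the selected set with a classifier on held-out samples, as the abstract describes.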

CONCLUSIONS

In-depth functional analysis of these SNPs and their nearby genes revealed the genetic bases of skin and eye color differences among populations.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d87b/4687076/15d79bbabbb2/12864_2015_2328_Fig1_HTML.jpg
