Zhu Xin, Jin Xin, Liu Jun, Yang Lan, Zou Li-Xin, Li Cai-Xia, Huang Jiang, Jiang Li
Institute of Forensic Medicine, Guizhou Medical University, Guiyang 550004, China.
Key Laboratory of Forensic Genetics, Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China.
Yi Chuan. 2024 Feb 20;46(2):149-167. doi: 10.16288/j.yczz.23-260.
Han Chinese form the largest ethnic group in China. Previous studies have focused mainly on their genetic origins, migration and integration, and on paternal genetic relationships within specific regional Han populations. However, a comprehensive analysis of the overall paternal genetic structure of Han populations is lacking. In this study, we performed Y-chromosome sequencing on 362 unrelated Han Chinese males from Qinghai, Sichuan and Liaoning provinces and integrated relevant data from previously reported studies. The final dataset comprised 1830 samples from 16 Han populations across 15 provinces in China, covering 89 Y-SNPs and 16 Y-STRs. We calculated Y-STR haplotype diversity (HD) and Y-SNP haplogroup frequencies, and used principal component analysis (PCA), phylogenetic trees and haplotype networks to explore genetic differentiation within Han populations and the genetic relationships between Han populations and the surrounding ethnic minorities. The O-M175 haplogroup was the predominant paternal lineage in Han populations, with frequencies ranging from 60.53% (Qinghai Han) to 92.7% (Guangdong Han). The subclades downstream of O-M175 showed distinct regional distribution patterns. Haplogroup O2-M122 was prevalent in all Han populations, and its frequency declined gradually from north to south. Conversely, the frequency of haplogroup O1b-M268 decreased from south to north, and it was especially common among Han populations in the Lingnan region. Haplogroup O1a-M119 was more frequent in central Han populations. These findings indicate that Chinese Han populations can be divided into three subgroups: northern, central and southern.
Notably, Qinghai Han differed significantly from Han populations in other regions. Regarding the relationships between Han populations and surrounding ethnic minorities, different Han populations showed closer genetic affinity with one another, while northern Han were more closely related to the Hui and southern Han to the Gelao and Li. In summary, this study presents a systematic analysis of haplogroup distribution, the genetic substructure of Han populations and their genetic relationships with surrounding ethnic minorities, based on 89 Y-SNPs and 16 Y-STRs. The results contribute to population genetics and forensic genetics and provide data support for forensic applications of the Y chromosome. Integrating Y-SNP haplogroups with Y-STR haplotypes gives a clearer picture of the genetic substructure within Han populations, which is relevant to both population genetics research and forensic science.
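The two summary statistics named in the abstract, Y-STR haplotype diversity (HD) and Y-SNP haplogroup frequency, can be illustrated with a short sketch. The abstract does not say which HD estimator was used; the code below assumes the commonly used Nei estimator, HD = n/(n-1) * (1 - sum of p_i^2), where p_i is the relative frequency of the i-th distinct haplotype among n samples. The function names and the toy samples are hypothetical and are not data from the study.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def haplotype_diversity(haplotypes: Iterable[Hashable]) -> float:
    """Nei's unbiased diversity: n / (n - 1) * (1 - sum(p_i ** 2)).

    Assumption: the abstract does not name its estimator; this is a
    common choice, not necessarily the one the authors used.
    """
    counts = Counter(haplotypes)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two samples are required")
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def haplogroup_frequencies(haplogroups: Iterable[str]) -> Dict[str, float]:
    """Relative frequency of each haplogroup label in one population."""
    counts = Counter(haplogroups)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("at least one sample is required")
    return {hg: c / n for hg, c in counts.items()}


# Hypothetical toy data: each Y-STR haplotype is a tuple of repeat counts
# at three made-up loci (the study typed 16 Y-STRs).
toy_haplotypes = [
    (14, 13, 29),
    (14, 13, 29),
    (15, 13, 30),
    (16, 12, 29),
    (15, 13, 30),
]
# Hypothetical haplogroup calls for the same five samples.
toy_haplogroups = ["O2-M122", "O2-M122", "O1b-M268", "O1a-M119", "other"]

hd = haplotype_diversity(toy_haplotypes)
freqs = haplogroup_frequencies(toy_haplogroups)
print(f"HD = {hd:.3f}")
for haplogroup, freq in sorted(freqs.items()):
    print(f"{haplogroup}: {freq:.2%}")
```

Treating each haplotype as a hashable tuple keeps the counting step independent of the number of loci, so the same function would apply unchanged to 16-locus profiles.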