Weise Jessica A, Ng Jillian, Oldt Robert F, Viray Joy, McCulloh Kelly L, Smith David Glenn, Kanthaswamy Sreetharan
Forensic Science Graduate Program, University of California, Davis, CA, USA.
Molecular Anthropology Laboratory, Department of Anthropology, University of California, Davis, CA, USA.
Forensic Sci Res. 2021 Sep 20;7(4):662-672. doi: 10.1080/20961790.2021.1963088. eCollection 2022.
The National Research Council recommends that genetic differentiation among subgroups of ethnic samples be lower than 3% of the total genetic differentiation within the ethnic sample to be used for estimating reliable random match probabilities for forensic use. Native American samples in the United States' Combined DNA Index System (CODIS) database represent four language families: Algonquian, Na-Dene, Eskimo-Aleut, and Salishan. However, a minimum of 27 Native American language families exists in the US, not including language isolates. Our goal was to ascertain whether genetic differences are correlated with language groupings and, if so, whether additional language families would provide a more accurate representation of current genetic diversity among tribal populations. The 21 short tandem repeat (STR) loci included in the GlobalFiler® PCR Amplification Kit were used to characterize six indigenous language families, including three of the four represented in the CODIS database (i.e. Algonquian, Na-Dene, and Eskimo-Aleut), and two language isolates (Miwok and Seri) using major population genetic diversity metrics such as F statistics and Bayesian clustering analysis of genotype frequencies. Most of the genetic variation (97%) was found within language families rather than among them (3%). In contrast, when only the three language families represented in both the CODIS database and the present study were considered, 4% of the genetic variation occurred among the language groups. Bayesian clustering resulted in a maximum posterior probability indicating three genetically distinct groups among the eight language families and isolates: (1) Eskimo, (2) Seri, and (3) all other language groups and isolates, thus confirming genetic subdivision among subgroups of the CODIS Native American database. This genetic structure indicates the need for an increased number of Native American populations based on language affiliation in the CODIS database as well as more robust sample sets for those language families.
Supplemental data for this article is available online at https://doi.org/10.1080/20961790.2021.1963088.
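To make the "among groups versus within groups" percentages concrete, the following is a minimal sketch of a single-locus fixation index computed from allele frequencies as (H_T - H_S) / H_T, a simplified Nei-style estimator. It is not the authors' procedure: the abstract does not say which estimator or software was used, the function names are hypothetical, the allele frequencies are invented for illustration, and subgroups are weighted equally regardless of sample size.

```python
from typing import Dict, List


def expected_heterozygosity(freqs: Dict[str, float]) -> float:
    """Expected heterozygosity at one locus: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def fst_single_locus(subpops: List[Dict[str, float]]) -> float:
    """Simplified F_ST = (H_T - H_S) / H_T for one locus.

    Assumes equal weighting of subgroups (an illustration-only simplification).
    Each dict maps allele name -> frequency within that subgroup.
    """
    if not subpops:
        raise ValueError("at least one subgroup is required")
    n = len(subpops)
    alleles = set().union(*subpops)  # every allele seen in any subgroup
    # Pooled allele frequencies: unweighted mean across subgroups.
    mean_freqs = {a: sum(pop.get(a, 0.0) for pop in subpops) / n for a in alleles}
    h_t = expected_heterozygosity(mean_freqs)                      # total diversity
    h_s = sum(expected_heterozygosity(pop) for pop in subpops) / n  # mean within-group diversity
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


if __name__ == "__main__":
    # Invented frequencies for two hypothetical subgroups at one biallelic locus.
    groups = [{"A": 0.6, "B": 0.4}, {"A": 0.4, "B": 0.6}]
    fst = fst_single_locus(groups)
    # Compare against the 3% recommendation cited in the abstract.
    print(f"F_ST = {fst:.3f}; exceeds 0.03 threshold: {fst > 0.03}")
```

In practice an estimate of this kind would be combined across all 21 STR loci and corrected for sample size; the sketch only shows how the among-group share of diversity is defined.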