Department of Mathematics and Gonda Brain Research Institute, Bar-Ilan University, 52900, Ramat-Gan, Israel.
Bioinformatics Research, Center for International Blood and Marrow Transplant Research, Minneapolis, MN, USA.
Immunogenetics. 2019 Nov;71(10):589-604. doi: 10.1007/s00251-019-01144-7. Epub 2019 Nov 18.
The human leukocyte antigen (HLA) is the most polymorphic region in humans. Anthropologists use HLA to trace the migration and evolution of populations. However, recent admixture between populations can mask the ancestral haplotype frequency distribution. We present a statistical method that uses high-resolution HLA haplotype frequencies to resolve population admixture through a non-negative matrix factorization formalism, and we validate it using haplotype frequencies from 56 world populations. The result is a minimal set of source components (SCs) that accounts for roughly 90% of the total variance in the studied admixtures. These SCs agree with the geographical distribution, phylogenies, and recent admixture events of the studied groups. With the growing number of multi-ethnic individuals, and of individuals who do not report race or ethnicity information, the HLA matching process for stem-cell and solid organ transplants is becoming more challenging. The presented algorithm provides a framework for breaking down highly admixed populations into SCs, which can be used to better match the rapidly growing population of multi-ethnic individuals worldwide.
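The general idea of factorizing a population-by-haplotype frequency matrix into non-negative source components can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the update rule, the normalization, or how the variance is measured, so the sketch assumes standard Lee-Seung multiplicative updates for the squared-error objective, uses synthetic data in place of real HLA frequencies, and introduces the hypothetical helper names `nmf` and `reconstruction_score`.

```python
import numpy as np


def nmf(V, k, n_iter=1000, seed=0, eps=1e-12):
    """Approximate V (populations x haplotypes) as W @ H with W, H >= 0.

    Uses multiplicative updates for the squared Frobenius error.
    This choice of algorithm is an assumption, not the paper's stated method.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps  # mixing weights per population
    H = rng.random((k, m)) + eps  # haplotype profile per source component
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def reconstruction_score(V, W, H):
    """Fraction of the squared Frobenius norm of V captured by W @ H.

    An illustrative stand-in for "variance accounted for"; the paper's
    exact definition is not given in the abstract.
    """
    return 1.0 - np.sum((V - W @ H) ** 2) / np.sum(V ** 2)


# Synthetic example: 3 hypothetical source components over 40 haplotypes,
# mixed into 12 hypothetical admixed populations.
rng = np.random.default_rng(42)
true_sources = rng.dirichlet(np.ones(40), size=3)   # each row sums to 1
true_mixing = rng.dirichlet(np.ones(3), size=12)    # each row sums to 1
V = true_mixing @ true_sources                      # observed frequencies

W, H = nmf(V, k=3)
score = reconstruction_score(V, W, H)

# Rescale so that each source component is a frequency distribution;
# the product W_norm @ H_norm is unchanged by this rescaling.
scale = H.sum(axis=1)
H_norm = H / scale[:, None]
W_norm = W * scale[None, :]
```

In this sketch each row of `H_norm` plays the role of one source component, and each row of `W_norm` gives the estimated contribution of every component to one population. The number of components `k` is fixed by hand here; choosing a minimal `k` is a separate model-selection question.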