
Structural and genetic diversity in the secreted mucins MUC5AC and MUC5B.

Author information

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA.

Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Moorenstr. 5, 40225 Düsseldorf, Germany; Center for Digital Medicine, Heinrich Heine University, Moorenstr. 5, 40225 Düsseldorf, Germany.

Publication information

Am J Hum Genet. 2024 Aug 8;111(8):1700-1716. doi: 10.1016/j.ajhg.2024.06.007. Epub 2024 Jul 10.

Abstract

The secreted mucins MUC5AC and MUC5B are large glycoproteins that play critical defensive roles in pathogen entrapment and mucociliary clearance. Their respective genes contain polymorphic and degenerate protein-coding variable number tandem repeats (VNTRs) that make the loci difficult to investigate with short reads. We characterize the structural diversity of MUC5AC and MUC5B by long-read sequencing and assembly of 206 human and 20 nonhuman primate (NHP) haplotypes. We find that human MUC5B is largely invariant (5,761-5,762 amino acids [aa]); however, seven haplotypes have expanded VNTRs (6,291-7,019 aa). In contrast, 30 allelic variants of MUC5AC encode 16 distinct proteins (5,249-6,325 aa) with cysteine-rich domain and VNTR copy-number variation. We group MUC5AC alleles into three phylogenetic clades: H1 (46%, ∼5,654 aa), H2 (33%, ∼5,742 aa), and H3 (7%, ∼6,325 aa). The two most common human MUC5AC variants are smaller than NHP gene models, suggesting a reduction in protein length during recent human evolution. Linkage disequilibrium and Tajima's D analyses reveal that East Asians carry exceptionally large blocks with an excess of rare variation (p < 0.05) at MUC5AC. To validate this result, we use Locityper for genotyping MUC5AC haplogroups in 2,600 unrelated samples from the 1000 Genomes Project. We observe a signature of positive selection in H1 among East Asians and a depletion of the likely ancestral haplogroup (H3). In Europeans, H3 alleles show an excess of common variation and deviate from Hardy-Weinberg equilibrium (p < 0.05), consistent with heterozygote advantage and balancing selection. This study provides a generalizable strategy to characterize complex protein-coding VNTRs for improved disease associations.
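The abstract reports that H3 alleles in Europeans deviate from Hardy-Weinberg equilibrium, consistent with heterozygote advantage. As a minimal sketch of what such a check involves, the Python below runs a standard chi-square goodness-of-fit test on genotype counts. It assumes the locus is collapsed to two allele classes (H3 versus non-H3); the function name and the genotype counts are hypothetical illustrations, not data or code from the study, and the abstract does not state which test the authors used.

```python
# Chi-square test for Hardy-Weinberg equilibrium at a biallelic locus.
# Assumption: haplogroups are collapsed to two classes, "A" (e.g. H3) and
# "B" (e.g. non-H3). All counts used below are made up for illustration.

CHI2_CRITICAL_1DF_P05 = 3.841  # chi-square critical value, 1 degree of freedom, p = 0.05


def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Return (chi2, expected_counts, heterozygote_excess) for genotype counts."""
    n = n_aa + n_ab + n_bb
    # Allele frequency of A: each AA individual carries two copies, each AB one.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    # Genotype counts expected under random mating: p^2, 2pq, q^2.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Positive values mean more heterozygotes than expected.
    het_excess = n_ab - expected[1]
    return chi2, expected, het_excess


# Hypothetical sample with many heterozygotes.
chi2, expected, het_excess = hardy_weinberg_chi2(10, 80, 10)
rejects_hwe = chi2 > CHI2_CRITICAL_1DF_P05
print(f"chi2={chi2:.2f}, het_excess={het_excess:.1f}, rejects_hwe={rejects_hwe}")
```

A significant statistic together with a positive heterozygote excess is the pattern usually read as compatible with heterozygote advantage. Significance alone does not establish balancing selection, because genotyping error or population structure can also produce deviations.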

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2263/11344006/2a9c13d47262/fx1.jpg
