Genomic structure of the horse major histocompatibility complex class II region resolved using PacBio long-read sequencing technology.

Author information

Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences (SLU), Box 7023, 750 07 Uppsala, Sweden.

Department of Veterinary Integrative Biosciences, College of Veterinary Medicine, Texas A&M University, College Station, TX 77843, USA.

Publication information

Sci Rep. 2017 Mar 31;7:45518. doi: 10.1038/srep45518.

Abstract

The mammalian major histocompatibility complex (MHC) region contains several gene families characterized by highly polymorphic loci with extensive nucleotide diversity, copy number variation of paralogous genes, and long repetitive sequences. This structural complexity has made it difficult to construct a reliable reference sequence of the horse MHC region. In this study, we used long-read single-molecule, real-time (SMRT) sequencing technology from Pacific Biosciences (PacBio) to sequence eight bacterial artificial chromosome (BAC) clones spanning the horse MHC class II region. The final assembly resulted in a continuous, gap-free sequence of 1,165,328 bp with 35 manually curated genomic loci, of which 23 were considered to be functional and 12 to be pseudogenes. Compared with the MHC class II region in other mammals, the corresponding region in the horse shows extraordinary copy number variation and a different relative location and directionality of the Eqca-DRB, -DQA, -DQB and -DOB loci. This is the first long-read sequence assembly of the horse MHC class II region with rigorous manual gene annotation, and it will serve as an important resource for association studies of immune-mediated equine diseases and for evolutionary analysis of genetic diversity in this region.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07ad/5374520/d637c7e86986/srep45518-f1.jpg
