Accurate and Efficient KIR Gene and Haplotype Inference From Genome Sequencing Reads With Novel K-mer Signatures.

Affiliations

Bioinformatics and Computational Biology, University of Minnesota, Rochester, MN, United States.

Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, United States.

Publication information

Front Immunol. 2020 Nov 26;11:583013. doi: 10.3389/fimmu.2020.583013. eCollection 2020.

DOI:10.3389/fimmu.2020.583013

PMID:33324401

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7727328/

Abstract

The killer-cell immunoglobulin-like receptor (KIR) proteins evolve to fight viruses and mediate the body's reaction to pregnancy. These roles provide selection pressure for variation at both the structural/haplotype and base/allele levels. At the same time, the genes have evolved relatively recently by tandem duplication and therefore exhibit very high sequence similarity over thousands of bases. These variation-homology patterns make it impossible to interpret KIR haplotypes from abundant short-read genome sequencing data at population scale using existing methods. Here, we developed an efficient computational approach for KIR probe interpretation (KPI) to accurately interpret an individual's KIR genes and haplotype pairs from KIR sequencing reads. We designed synthetic 25-base sequence probes by analyzing previously reported haplotype sequences, and we developed a bioinformatics pipeline to interpret the probes in the context of 16 KIR genes and 16 haplotype structures. We demonstrated its accuracy on a synthetic data set as well as on real whole-genome sequences from 748 individuals from The Genome of the Netherlands (GoNL). The GoNL predictions were compared with SNP-based predictions. Our results show a 100% accuracy rate for the synthetic tests and a 99.6% family-consistency rate in the GoNL tests. Agreement with the SNP-based calls on KIR genes ranges from 72% to 100% with a mean of 92%; most differences occur in genes , , , and where KPI predicts presence and the SNP-based interpretation predicts absence. Overall, the evidence suggests that KPI's accuracy is 97% or greater for both KIR gene and haplotype-pair predictions, and that presence/absence genotyping leads to ambiguous haplotype-pair predictions with 16 reference KIR haplotype structures. KPI is free, open, and easily executable as a Nextflow workflow supported by a Docker environment at https://github.com/droeatumn/kpi.
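
The general idea of probe-based presence/absence calling described in the abstract can be illustrated with a minimal Python sketch. This is not KPI's implementation: the gene labels, probe sequences, haplotype definitions, and the `min_hits` threshold below are all hypothetical toy values, and only the 25-base probe length comes from the abstract. The sketch scans reads (both strands) for exact probe matches, calls a gene present when a probe is seen, and then lists every pair of reference haplotypes whose combined gene content matches the call.

```python
from itertools import combinations_with_replacement

K = 25  # probe length stated in the abstract


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def kmers(seq, k=K):
    """Yield every k-length substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def call_genes(reads, probes, min_hits=1):
    """Call a gene present when its probes are seen at least min_hits times.

    `probes` maps a (hypothetical) gene label to a set of 25-mers.
    """
    hits = {gene: 0 for gene in probes}
    for read in reads:
        for strand in (read, reverse_complement(read)):
            for kmer in kmers(strand):
                for gene, gene_probes in probes.items():
                    if kmer in gene_probes:
                        hits[gene] += 1
    return {gene for gene, n in hits.items() if n >= min_hits}


def consistent_pairs(present, haplotypes):
    """List haplotype pairs whose combined gene content equals `present`."""
    return [
        (a, b)
        for a, b in combinations_with_replacement(sorted(haplotypes), 2)
        if haplotypes[a] | haplotypes[b] == present
    ]


# Toy 25-base probes (made up for illustration, not real KIR sequence).
PROBE_A = "AC" * 12 + "G"
PROBE_B = "TTG" * 8 + "A"
PROBE_C = "GGA" * 8 + "C"
probes = {"geneA": {PROBE_A}, "geneB": {PROBE_B}, "geneC": {PROBE_C}}

# Two toy reads: one carries probe A, one carries probe B on the other strand.
reads = ["TT" + PROBE_A + "CC", reverse_complement("GA" + PROBE_B + "AT")]

# Hypothetical reference haplotypes described only by gene content.
haplotypes = {
    "hap1": {"geneA"},
    "hap2": {"geneA", "geneB"},
    "hap3": {"geneB"},
    "hap4": {"geneA", "geneC"},
}

present = call_genes(reads, probes)
pairs = consistent_pairs(present, haplotypes)
print(sorted(present))
print(pairs)
```

In this toy example several haplotype pairs explain the same set of present genes, which mirrors the abstract's remark that presence/absence genotyping alone can leave haplotype-pair predictions ambiguous.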


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca78/7727328/afc88a3fb3e5/fimmu-11-583013-g001.jpg
