Milius Robert P, Heuer Michael, Valiga Daniel, Doroschak Kathryn J, Kennedy Caleb J, Bolon Yung-Tsi, Schneider Joel, Pollack Jane, Kim Hwa Ran, Cereb Nezih, Hollenbach Jill A, Mack Steven J, Maiers Martin
National Marrow Donor Program, MN, USA.
Hum Immunol. 2015 Dec;76(12):963-74. doi: 10.1016/j.humimm.2015.08.001. Epub 2015 Aug 28.
We present an electronic format for exchanging HLA and KIR genotyping data, with extensions for next-generation sequencing (NGS). This format addresses NGS data exchange by refining the Histoimmunogenetics Markup Language (HML) to conform to the proposed Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) reporting guidelines (miring.immunogenomics.org). Our refinements of HML include two major additions. First, new XML structures support NGS by capturing the additional data and metadata required to produce a genotyping result, including analysis-dependent (dynamic) and method-dependent (static) components. The full genotype, the consensus sequence, and the surrounding metadata are included directly, while the raw sequence reads and platform documentation are referenced externally. Second, genotype ambiguity is fully represented by integrating Genotype List Strings, which use a hierarchical set of delimiters to represent allele and genotype ambiguity completely and accurately. HML also continues to support the transmission of results from legacy methods (e.g., site-specific oligonucleotide, sequence-specific priming, and sequence-based typing (SBT)), adding features such as support for multiple group-specific sequencing primers, and it accommodates techniques that combine multiple methods to obtain a single result, such as SBT integrated with NGS.
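The abstract describes Genotype List Strings as using a hierarchical set of delimiters. As a rough illustration only, the sketch below splits such a string into nested lists, one level per delimiter. It assumes the delimiter set and precedence of the published GL String grammar (`^` between loci, `|` between ambiguous genotypes, `+` between gene copies, `~` between phased alleles of a haplotype, `/` between ambiguous alleles). The function name `parse_gl_string` and the sample allele names are hypothetical and are not part of HML or any HML tooling.

```python
# Minimal sketch of hierarchical GL String splitting.
# Assumption: delimiters ordered from outermost (lowest precedence)
# to innermost (highest precedence), per the GL String grammar.
GL_DELIMITERS = ["^", "|", "+", "~", "/"]


def parse_gl_string(gl, level=0):
    """Split a GL String into nested lists, one nesting level per delimiter.

    The result always has the same depth:
    loci -> genotypes -> gene copies -> haplotype members -> alleles.
    """
    if level == len(GL_DELIMITERS):
        # Below the innermost delimiter only a single allele name remains.
        return gl
    parts = gl.split(GL_DELIMITERS[level])
    return [parse_gl_string(part, level + 1) for part in parts]


if __name__ == "__main__":
    # Hypothetical example: one locus, one genotype, two gene copies,
    # the first of which is ambiguous between two alleles.
    example = "HLA-A*01:01/HLA-A*01:02+HLA-A*24:02"
    print(parse_gl_string(example))
```

Keeping a fixed depth, even where a level holds a single element, means a consumer can always find allele ambiguity at the innermost list and genotype ambiguity at the second level without inspecting the string again. A production parser would also need to validate allele names and reject empty fields, which this sketch does not attempt.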