Department of Biological Sciences, Auburn University, Auburn, AL, USA.
Methods Mol Biol. 2022;2421:151-169. doi: 10.1007/978-1-0716-1944-5_11.
Genome sequences are quickly becoming available from a variety of organisms, providing researchers with an abundance of previously inaccessible information and an important source of insight into immune mechanisms. There are a variety of methods to accurately characterize genes from new genome sequences, but immune receptors pose special challenges for these techniques. Immune receptors, particularly those that directly recognize pathogens, often diverge rapidly among species and are commonly found in large, complex multigene families. Because of these characteristics, immune receptors tend to be overlooked or misannotated in large-scale genomic surveys. We describe here a strategy to characterize homologs of immune receptors and to identify putative receptors from newly assembled genome or transcriptome sequences. The description of these protocols is aimed at a typical immunologist and does not rely on substantial a priori knowledge of bioinformatics. The approach is based on using low-stringency sequence searches to identify divergent homologs. For receptors with multiple domains, the intersection of low-stringency searches can be used to identify divergent receptor sequences with high confidence. For multigene families, these predictions can be refined using sequence conservation among gene family paralogs. Assembled genome sequences serve as a critical foundation for subsequent functional characterization and remove long-standing barriers in understanding the evolution of immune recognition systems.
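The intersection step for multidomain receptors can be illustrated with a minimal sketch. It assumes that search hits have already been reduced to (sequence ID, domain, E-value) records. The domain names `TIR` and `LRR`, the gene IDs, the function name `intersect_domain_hits`, and the permissive cutoff of 10 are hypothetical choices for illustration and are not taken from the protocol itself.

```python
from collections import defaultdict


def intersect_domain_hits(hits, required_domains, max_evalue=10.0):
    """Return sorted IDs of sequences with a permissive hit to every required domain.

    hits: iterable of (seq_id, domain, evalue) tuples from low-stringency searches.
    required_domains: set of domain names that must all be present.
    max_evalue: permissive E-value cutoff (an assumed, illustrative value).
    """
    required = set(required_domains)
    domains_by_seq = defaultdict(set)
    for seq_id, domain, evalue in hits:
        # Keep weak hits on purpose: each single search is allowed to be noisy.
        if evalue <= max_evalue and domain in required:
            domains_by_seq[seq_id].add(domain)
    # Only sequences supported by every independent domain search survive.
    return sorted(s for s, found in domains_by_seq.items() if found == required)


# Hypothetical hit records, invented for this example.
example_hits = [
    ("gene_001", "TIR", 0.5),
    ("gene_001", "LRR", 2.0),
    ("gene_002", "TIR", 1e-3),
    ("gene_003", "LRR", 8.0),
    ("gene_004", "TIR", 50.0),
    ("gene_004", "LRR", 0.1),
]

candidates = intersect_domain_hits(example_hits, {"TIR", "LRR"})
print(candidates)  # ['gene_001']
```

Each individual search tolerates weak matches, so a single hit is poor evidence on its own. Requiring a hit to every domain is what restores confidence in a candidate, at the cost of missing receptors whose domains have diverged beyond even the permissive cutoff.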