Department of Plant and Wildlife Sciences, Brigham Young University, Provo, UT 84602.
LOEWE Centre for Translational Biodiversity Genomics, Frankfurt 60325, Germany.
Proc Natl Acad Sci U S A. 2023 May 2;120(18):e2221528120. doi: 10.1073/pnas.2221528120. Epub 2023 Apr 24.
Arthropod silk is vital to the evolutionary success of hundreds of thousands of species. The primary proteins in silks are often encoded by long, repetitive gene sequences. Until recently, sequencing and assembling these complex gene sequences proved intractable because of their repetitive structure. Here, using high-quality long-read sequencing, we show that there is extensive variation, in both length and repeat motif order, between alleles of silk genes within individual arthropods. Further, this variation exists across two deep, independent origins of silk that diverged more than 500 Mya: the insect clade containing caddisflies and butterflies, and spiders. This remarkable convergence in previously overlooked patterns of allelic variation across multiple origins of silk suggests common mechanisms for the generation and maintenance of structural protein-coding genes. Future genomic efforts to connect genotypes to phenotypes should account for such allelic variation.