Beldade Patrícia, Rudd Stephen, Gruber Jonathan D, Long Anthony D
Department of Ecology and Evolutionary Biology, University of California at Irvine, Irvine, USA.
BMC Genomics. 2006 May 31;7:130. doi: 10.1186/1471-2164-7-130.
Butterfly wing color patterns are a key model for integrating evolutionary developmental biology and the study of adaptive morphological evolution. Yet, despite the biological, economic and educational value of butterflies, they are still relatively under-represented in terms of available genomic resources. Here, we describe an Expressed Sequence Tag (EST) project for Bicyclus anynana that has identified the largest available collection to date of expressed genes for any butterfly.
By targeting cDNAs from developing wings at the stages when pattern is specified, we biased gene discovery towards genes potentially involved in pattern formation. Assembly of 9,903 ESTs from a subtracted library allowed us to identify 4,251 genes, of which 2,461 were annotated based on BLAST analyses against relevant gene collections. Gene prediction software identified 2,202 peptides, of which 215, each longer than 100 amino acids, had no homology to any known protein and thus potentially represent novel or highly diverged butterfly genes. We combined gene and Single Nucleotide Polymorphism (SNP) identification by constructing cDNA libraries from pools of outbred individuals and by sequencing clones from the 3' end to maximize alignment depth. Alignments of multi-member contigs allowed us to identify over 14,000 putative SNPs, with 316 genes having at least one high-confidence double-hit SNP. We furthermore identified 320 microsatellites in transcribed genes that can potentially be used as genetic markers.
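As a rough illustration of how putative SNPs can be read off a contig alignment, the minimal sketch below scans alignment columns and reports positions where at least two alleles are each supported by two or more reads. That threshold is one common reading of "double-hit" and is an assumption here, not the criterion stated in this abstract; the function name `call_double_hit_snps`, the `min_allele_count` parameter, and the toy reads are all hypothetical.

```python
from collections import Counter


def call_double_hit_snps(aligned_reads, min_allele_count=2):
    """Report alignment columns carrying two or more well-supported alleles.

    aligned_reads: equal-length strings from one contig alignment,
    with '-' marking gaps. Assumed criterion (not taken from the paper):
    an allele counts only if seen in at least min_allele_count reads.
    """
    snps = []
    # zip(*reads) walks the alignment column by column
    for column, bases in enumerate(zip(*aligned_reads)):
        # Ignore gaps and ambiguous base calls
        counts = Counter(base for base in bases if base in "ACGT")
        supported = {b: n for b, n in counts.items() if n >= min_allele_count}
        if len(supported) >= 2:
            snps.append((column, supported))
    return snps


# Toy alignment: column 2 has G (3 reads) and C (2 reads);
# column 4 has a single-read G, which is treated as a likely error.
reads = [
    "ACGTTA",
    "ACGTTA",
    "ACCTTA",
    "ACCTT-",
    "ACGTGA",
]
snps = call_double_hit_snps(reads)
print(snps)
```

Requiring each allele to appear in at least two reads filters out variants seen only once, which are harder to distinguish from sequencing errors; deeper alignments, such as those obtained by sequencing every clone from the 3' end, give more columns a chance to meet that threshold.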
Our project was designed to combine gene and sequence polymorphism discovery and has generated the largest gene collection available for any butterfly and many potential markers in expressed genes. These resources will be invaluable for exploring the potential of B. anynana in particular, and butterflies in general, as models in ecological, evolutionary, and developmental genetics.