Hughes Joseph, Longhorn Stuart J, Papadopoulou Anna, Theodorides Kosmas, de Riva Alessandra, Mejia-Chang Monica, Foster Peter G, Vogler Alfried P
Department of Entomology, The Natural History Museum, London, United Kingdom.
Mol Biol Evol. 2006 Feb;23(2):268-78. doi: 10.1093/molbev/msj041. Epub 2005 Oct 19.
Expressed sequence tag (EST) sequences can provide a wealth of data for phylogenetic and genomic studies, but the utility of these resources is restricted by poor taxonomic sampling. Here, we use small EST libraries (<1,000 clones) to generate phylogenetic markers across a broad sample of insects, focusing on the species-rich Coleoptera (beetles). We sequenced over 23,000 ESTs from 34 taxa, which yielded 8,728 unique, nonredundant sequences after clustering. Across taxa, these sequences could be grouped into 731 gene clusters, the largest of which corresponded to mitochondrial DNA transcripts and to the chymotrypsin, actin, troponin, and tubulin gene families. While levels of paralogy were high in most gene clusters, several midsized clusters, including many ribosomal protein (RP) genes, appeared to be free of expressed paralogs. To evaluate the utility of EST data for molecular systematics, we curated available transcripts for 66 RP genes from representatives of the major groups of Coleoptera. Phylogenetic analyses using supertree and supermatrix approaches gave results consistent with emerging conclusions about basal relationships in Coleoptera. Numerous small EST libraries from a lineage that is densely sampled taxonomically can provide a core set of genes that together act as a scaffold for phylogenetic reconstruction, comparative genomics, and studies of gene evolution.
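As an illustration of the supermatrix idea mentioned in the abstract, the following minimal Python sketch concatenates per-gene alignments into a single matrix and pads genes that are missing for a taxon, which is the typical situation when markers come from small EST libraries. It is not the authors' pipeline: the function name `build_supermatrix`, the `?` missing-data symbol, and the toy gene and taxon labels are all assumptions made for the example.

```python
def build_supermatrix(gene_alignments, missing="?"):
    """Concatenate per-gene alignments into one supermatrix.

    gene_alignments maps a gene name to a dict of taxon -> aligned sequence.
    A taxon that lacks a gene is padded with the missing-data symbol.
    """
    # Collect every taxon that appears in at least one gene alignment.
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    parts = {t: [] for t in taxa}

    # Process genes in a fixed (sorted) order so columns are reproducible.
    for gene in sorted(gene_alignments):
        aln = gene_alignments[gene]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences for {gene} are not aligned to one length")
        width = lengths.pop()
        for t in taxa:
            # Use the real sequence if present, otherwise a block of missing data.
            parts[t].append(aln.get(t, missing * width))

    return {t: "".join(blocks) for t, blocks in parts.items()}


# Hypothetical toy data (labels and sequences are illustrative only).
toy = {
    "RpL3": {"Tribolium": "ATGGCC", "Carabus": "ATGGCT"},
    "RpS4": {"Tribolium": "GGTAAA", "Dytiscus": "GGCAAA"},
}
supermatrix = build_supermatrix(toy)
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

Each taxon ends up with a sequence of the same total length, so the result can be written out in whatever alignment format a phylogenetics program expects. A supertree analysis would instead infer one tree per gene and combine the trees, which this sketch does not attempt.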