Nesbø Camilla L, Boucher Yan, Dlutek Marlena, Doolittle W Ford
Department of Biochemistry and Molecular Biology, Dalhousie University and Genome Atlantic, 5850 College Street, Halifax, Nova Scotia, Canada, B3H1X5.
Environ Microbiol. 2005 Dec;7(12):2011-26. doi: 10.1111/j.1462-2920.2005.00918.x.
Metagenomic data, especially sequence data from large-insert clones, are most useful when reasonable inferences about the phylogenetic origins of inserts can be made. Often, clones that bear phylotypic markers (usually ribosomal RNA genes) are sought, but sometimes phylogenetic assignments have been based on the preponderance of BLAST hits obtained with predicted protein-coding sequences (CDSs). Here we use a cloning method that greatly enriches for ribosomal RNA-bearing fosmid clones to ask two questions: (i) how reliably can we judge the phylogenetic origin of a clone (that is, its rRNA phylotype) from the sequences of its CDSs? and (ii) how much lateral gene transfer (LGT) do we see, as assessed by CDSs of different phylogenetic origins on the same fosmid? We sequenced 12 rRNA-containing fosmid clones, obtained from libraries constructed using DNA isolated from Baltimore harbour sediments. Three of the clones are from bacterial candidate divisions for which no cultured representatives are available, and thus represent the first protein-coding sequences from these major bacterial lineages. The amount of LGT was assessed by making phylogenetic trees of all the CDSs in the fosmid clones and comparing the phylogenetic position of each CDS to the rRNA phylotype. We find that the majority of CDSs in each fosmid, 57-96%, agree with their respective rRNA genes. However, we also find that a significant fraction of the CDSs in each fosmid, 7-44%, has been acquired by LGT. In several cases, we can infer co-transfer of functionally related genes, and generate hypotheses about the mechanism and ecological significance of transfer.
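The comparison described in the abstract (placing each CDS in a phylogenetic tree and checking whether its position agrees with the rRNA phylotype of its fosmid) can be illustrated with a minimal bookkeeping sketch. This is not the authors' pipeline: the function name `summarize_fosmid`, the lineage labels, and the example data are all hypothetical, and the sketch assumes that the tree-based lineage of each CDS has already been determined elsewhere. CDSs without a resolved placement are counted separately and excluded from the fractions, which is also an assumption.

```python
from collections import Counter
from typing import Dict, Optional, Union


def summarize_fosmid(
    rrna_phylotype: str,
    cds_lineages: Dict[str, Optional[str]],
) -> Dict[str, Union[int, float]]:
    """Tally CDSs that agree or disagree with the fosmid's rRNA phylotype.

    cds_lineages maps a CDS identifier to the lineage inferred from its
    tree, or to None when no lineage could be resolved (hypothetical input
    format, not taken from the paper).
    """
    counts = Counter(concordant=0, discordant=0, unresolved=0)
    for lineage in cds_lineages.values():
        if lineage is None:
            counts["unresolved"] += 1
        elif lineage == rrna_phylotype:
            counts["concordant"] += 1
        else:
            # A lineage differing from the rRNA phylotype is treated here
            # as a candidate lateral gene transfer.
            counts["discordant"] += 1

    # Fractions are taken over CDSs with a resolved lineage only.
    informative = counts["concordant"] + counts["discordant"]
    return {
        "concordant": counts["concordant"],
        "discordant": counts["discordant"],
        "unresolved": counts["unresolved"],
        "fraction_concordant": counts["concordant"] / informative if informative else 0.0,
        "fraction_discordant": counts["discordant"] / informative if informative else 0.0,
    }


# Made-up example: one fosmid with five CDSs and invented lineage labels.
example = summarize_fosmid(
    "Deltaproteobacteria",
    {
        "cds1": "Deltaproteobacteria",
        "cds2": "Deltaproteobacteria",
        "cds3": "Firmicutes",
        "cds4": "Deltaproteobacteria",
        "cds5": None,
    },
)
print(example)
```

In practice the lineage of a CDS would rarely reduce to a single exact label match; a real analysis would need a rule for what counts as agreement (for example, the CDS grouping with the same division as the rRNA gene with adequate support), which this sketch deliberately leaves out.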