UPMC Univ Paris 06, UMR 7628, MBCE, Observatoire Océanologique, F-66651, Banyuls/mer, France.
Genome Biol. 2008 Jan 7;9(1):R5. doi: 10.1186/gb-2008-9-1-r5.
With genome sequencing becoming increasingly affordable, environmental shotgun sequencing of the microorganisms present in an environment generates a challenging amount of sequence data for the scientific community. These sequence data enable the diversity of the microbial world and the metabolic pathways within an environment to be investigated, an achievement previously unthinkable with traditional approaches. DNA sequence data assembled from extracts of 0.8 µm filtered Sargasso seawater unveiled an unprecedented glimpse of marine prokaryotic diversity and gene content. Serendipitously, many sequences representing picoeukaryotes (cell size <2 µm) were also present within this dataset. We investigated the picoeukaryotic diversity of this database by searching for sequences containing homologs of eight nuclear anchor genes that are well conserved throughout the eukaryotic lineage, as well as one chloroplast gene and one mitochondrial gene.
We found up to 41 distinct eukaryotic scaffolds, with a broad phylogenetic spread on the eukaryotic tree of life. The average eukaryotic scaffold size is 2,909 bp, with one gap every 1,253 bp. Strikingly, the AT frequency of the eukaryotic sequences (51.4%) is significantly lower than the average AT frequency of the metagenome (61.4%). The eukaryotic diversity detected represents 4% to 18% of the estimated prokaryotic diversity, depending on the average prokaryotic versus eukaryotic genome size ratio.
Despite similar cell size, eukaryotic sequences of the Sargasso Sea metagenome have higher GC content, suggesting that different environmental pressures affect the evolution of their base composition.
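The summary statistics reported above (AT frequency, mean scaffold size, and gap spacing) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `at_frequency` and `scaffold_stats` are hypothetical, the input sequences are toy strings, and the sketch assumes that scaffold gaps are encoded as runs of `N`.

```python
import re


def at_frequency(seqs):
    """Fraction of A/T among unambiguous bases (A, C, G, T) over all sequences."""
    at = gc = 0
    for s in seqs:
        s = s.upper()
        at += s.count("A") + s.count("T")
        gc += s.count("G") + s.count("C")
    total = at + gc
    return at / total if total else float("nan")


def scaffold_stats(seqs):
    """Return (mean scaffold length in bp, bp per gap).

    Assumption: each maximal run of 'N' counts as one gap.
    """
    if not seqs:
        return float("nan"), float("inf")
    lengths = [len(s) for s in seqs]
    gaps = sum(len(re.findall(r"N+", s.upper())) for s in seqs)
    mean_len = sum(lengths) / len(lengths)
    bp_per_gap = sum(lengths) / gaps if gaps else float("inf")
    return mean_len, bp_per_gap


# Toy scaffolds, for illustration only (not data from the study).
toy = ["ATNNGC", "AATT"]
print(at_frequency(toy))
print(scaffold_stats(toy))
```

Applying `at_frequency` separately to the eukaryotic scaffolds and to the whole metagenome would give the two values being compared; whether their difference is significant requires a statistical test that this sketch does not cover.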