Artamonova Irena I, Lappi Tanya, Zudina Liudmila, Mushegian Arcady R
N.I. Vavilov Institute of General Genetics RAS, Gubkina 3, Moscow, 119991, Russia.
A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Bol'shoy Karetny 19, Moscow, 127994, Russia.
Environ Microbiol. 2015 Jul;17(7):2203-8. doi: 10.1111/1462-2920.12854. Epub 2015 Apr 28.
Assessment of phylogenetic positions of predicted gene and protein sequences is a routine step in any genome project, useful for validating the species' taxonomic position and for evaluating hypotheses about genome evolution and function. Several recent eukaryotic genome projects have reported multiple gene sequences that were much more similar to homologues in bacteria than to any eukaryotic sequence. In the spirit of the times, horizontal gene transfer from bacteria to eukaryotes has been invoked in some of these cases. Here, we show, using comparative sequence analysis, that some of those bacteria-like genes indeed appear likely to have been horizontally transferred from bacteria to eukaryotes. In other cases, however, the evidence strongly indicates that the eukaryotic DNA sequenced in the genome project contains a sample of non-integrated DNA from the actual bacteria, possibly providing a window into the host microbiome. Recent literature suggests also that common reagents, kits and laboratory equipment may be systematically contaminated with bacterial DNA, which appears to be sampled by metagenome projects non-specifically. We review several bioinformatic criteria that help to distinguish putative horizontal gene transfers from the admixture of genes from autonomously replicating bacteria in their hosts' genome databases or from the reagent contamination.