Haft Daniel H
J Craig Venter Institute, Rockville, MD, USA.
Curr Opin Microbiol. 2015 Feb;23:189-96. doi: 10.1016/j.mib.2014.11.017. Epub 2015 Jan 21.
Bioinformatics looks to many microbiologists like a service industry. In this view, annotation starts with what is known from experiments in the lab, makes reasonable inferences of which genes match other genes in function, builds databases to make all that we know accessible, but creates nothing truly new. Experiments lead, then biocuration and computational biology follow. But the astounding success of genome sequencing is changing the annotation paradigm. Every genome sequenced is an intercepted coded message from the microbial world, and as all cryptographers know, it is easier to decode a thousand messages than a single message. Some biology is best discovered not by phenomenology, but by decoding genome content, forming hypotheses, and doing the first few rounds of validation computationally. Through such reasoning, a role and function may be assigned to a protein with no sequence similarity to any protein yet studied. Experimentation can follow after the discovery to cement and to extend the findings. Unfortunately, this approach remains so unfamiliar to most bench scientists that lab work and comparative genomics typically segregate to different teams working on unconnected projects. This review will discuss several themes in comparative genomics as a discovery method, including highly derived data, use of patterns of design to reason by analogy, and in silico testing of computationally generated hypotheses.