Haag Eric S, Thomas Cristel G
Department of Biology, University of Maryland, 1210 Biology-Psychology Building, College Park, MD, 20742, USA.
Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada, M5S 3B2.
Methods Mol Biol. 2015;1327:11-21. doi: 10.1007/978-1-4939-2842-2_2.
The genome of the nematode Caenorhabditis elegans was the first of any animal to be sequenced completely, and it remains the "gold standard" for completeness and annotation. Even before the C. elegans genome was completed, however, biologists began examining the generality of its features in the genomes of other Caenorhabditis species. With many such genomes now sequenced and available via WormBase, C. elegans researchers are often confronted with the question of how to interpret comparative genomic data. In this article, we present practical approaches to several common issues: possible sources of error in homology annotations; the often complex relationships among sequence similarity, orthology, paralogy, and gene family evolution; the impact of sexual mode on genome assemblies and content; and the determination and use of synteny as a tool.