Howard Hughes Medical Institute, Columbia University, College of Physicians and Surgeons, New York, New York, United States of America.
PLoS One. 2011;6(5):e20085. doi: 10.1371/journal.pone.0020085. Epub 2011 May 25.
C. elegans is an important model for genetic studies relevant to human biology and disease. We sought to assess the orthology between C. elegans and human genes to better understand the relationship between their genomes and to generate a compelling list of candidates to streamline RNAi-based screens in this model.
We performed a meta-analysis of results from four orthology prediction programs and generated a compendium, "OrthoList", containing 7,663 C. elegans protein-coding genes. Various assessments indicate that OrthoList has extensive coverage with low false-positive and false-negative rates. Part of this evaluation examined the conservation of components of the receptor tyrosine kinase, Notch, Wnt, TGF-β and insulin signaling pathways, and led us to update compendia of conserved C. elegans kinases, nuclear hormone receptors, F-box proteins, and transcription factors. Comparison with two published genome-wide RNAi screens indicated that virtually all of the conserved hits would have been obtained had just the OrthoList set (∼38% of the genome) been targeted. We organized OrthoList by InterPro domains and Gene Ontology annotation, making it easy to identify C. elegans orthologs of human disease genes for potential functional analysis.
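As a rough illustration of what pooling the output of several orthology prediction programs can look like, the following sketch counts how many programs support each worm gene and then filters by a support threshold. It is not the authors' pipeline: the abstract does not state how the four sets of predictions were combined, so the program names, gene identifiers, function names, and the `min_programs` threshold are all hypothetical placeholders.

```python
from collections import Counter


def merge_predictions(predictions):
    """Count, for each worm gene, how many programs predict a human ortholog.

    `predictions` maps a program name to the set of worm gene IDs for which
    that program reports at least one human ortholog (hypothetical input).
    """
    support = Counter()
    for genes in predictions.values():
        support.update(set(genes))  # each program counts once per gene
    return dict(support)


def select_genes(support, min_programs=1):
    """Return genes supported by at least `min_programs` programs, sorted."""
    return sorted(g for g, n in support.items() if n >= min_programs)


# Toy data only: placeholder program names and gene IDs, not real predictions.
predictions = {
    "program_A": {"gene-1", "gene-2", "gene-3"},
    "program_B": {"gene-2", "gene-3"},
    "program_C": {"gene-3", "gene-4"},
    "program_D": {"gene-3"},
}

support = merge_predictions(predictions)
union_list = select_genes(support, min_programs=1)   # any single program suffices
strict_list = select_genes(support, min_programs=3)  # at least three programs agree
```

A low threshold favors coverage (fewer false negatives), while a high threshold favors agreement between programs (fewer false positives); which trade-off suits a given screen is a judgment call that this sketch leaves open.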
We anticipate that OrthoList will be of considerable utility to C. elegans researchers for streamlining RNAi screens, by focusing on genes with apparent human orthologs, thus reducing screening effort by ∼60%. Moreover, we find that OrthoList provides a useful basis for annotating orthology and reveals more C. elegans orthologs of human genes in various functional groups, such as transcription factors, than previously described.