C. elegans ORFeome version 3.1: increasing the coverage of ORFeome resources with improved gene predictions.

Authors

Lamesch Philippe, Milstein Stuart, Hao Tong, Rosenberg Jennifer, Li Ning, Sequerra Reynaldo, Bosak Stephanie, Doucette-Stamm Lynn, Vandenhaute Jean, Hill David E, Vidal Marc

Affiliation

Center for Cancer Systems Biology and Department of Cancer Biology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, Massachusetts 02115, USA.

Publication information

Genome Res. 2004 Oct;14(10B):2064-9. doi: 10.1101/gr.2496804.

DOI: 10.1101/gr.2496804

PMID: 15489327

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC528921/

Abstract

The first version of the Caenorhabditis elegans ORFeome cloning project, based on release WS9 of Wormbase (August 1999), provided experimental verifications for approximately 55% of predicted protein-encoding open reading frames (ORFs). The remaining 45% of predicted ORFs could not be cloned, possibly as a result of mispredicted gene boundaries. Since the release of WS9, gene predictions have improved continuously. To test the accuracy of evolving predictions, we attempted to PCR-amplify from a highly representative worm cDNA library and Gateway-clone approximately 4200 ORFs missed earlier and for which new predictions are available in WS100 (May 2003). In this set we successfully cloned 63% of ORFs with supporting experimental data ("touched" ORFs), and 42% of ORFs with no supporting experimental evidence ("untouched" ORFs). Approximately 2000 full-length ORFs were cloned in-frame, 13% of which were corrected in their exon/intron structure relative to WS100 predictions. In total, approximately 12,500 C. elegans ORFs are now available as Gateway Entry clones for various reverse proteomics (ORFeome v3.1). This work illustrates why the cloning of a complete C. elegans ORFeome, and likely the ORFeomes of other multicellular organisms, needs to be an iterative process that requires multiple rounds of experimental validation together with gradually improving gene predictions.
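As a rough illustration of the figures quoted in the abstract, the short sketch below derives two approximate quantities: the share of the roughly 4200 re-attempted ORFs that ended up as full-length in-frame clones, and the number of clones whose exon/intron structure was corrected relative to WS100. All inputs are the rounded values from the abstract, and the function and variable names are hypothetical, so the outputs are estimates rather than numbers reported by the authors.

```python
# Approximate figures quoted in the abstract (all are rounded in the source).
ATTEMPTED_ORFS = 4200       # ORFs re-attempted using WS100 predictions
CLONED_IN_FRAME = 2000      # full-length ORFs cloned in-frame
CORRECTED_FRACTION = 0.13   # share of in-frame clones with corrected exon/intron structure


def summarize(attempted, cloned, corrected_fraction):
    """Return rough derived values; inputs are approximations, so outputs are too."""
    return {
        # Fraction of attempted ORFs that yielded a full-length in-frame clone.
        "in_frame_share_of_attempted": cloned / attempted,
        # Estimated number of clones whose structure differs from the WS100 prediction.
        "corrected_orfs": round(cloned * corrected_fraction),
    }


summary = summarize(ATTEMPTED_ORFS, CLONED_IN_FRAME, CORRECTED_FRACTION)
print(f"In-frame clones as share of attempted ORFs: "
      f"{summary['in_frame_share_of_attempted']:.0%}")
print(f"ORFs with corrected structure: about {summary['corrected_orfs']}")
```

Under these rounded inputs, roughly half of the re-attempted ORFs yielded in-frame clones and a few hundred of those needed a structural correction, which is consistent with the abstract's argument that cloning and gene prediction have to be refined together over several rounds.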

Similar articles

C. elegans ORFeome version 1.1: experimental verification of the genome annotation and resource for proteome-scale protein expression.

Nat Genet. 2003 May;34(1):35-41. doi: 10.1038/ng1140.

Closing in on the C. elegans ORFeome by cloning TWINSCAN predictions.

Genome Res. 2005 Apr;15(4):577-82. doi: 10.1101/gr.3329005.

WormBase as an integrated platform for the C. elegans ORFeome.

Genome Res. 2004 Oct;14(10B):2155-61. doi: 10.1101/gr.2521304.

WorfDB: the Caenorhabditis elegans ORFeome Database.

Nucleic Acids Res. 2003 Jan 1;31(1):237-40. doi: 10.1093/nar/gkg092.

Large-scale RACE approach for proactive experimental definition of C. elegans ORFeome.

Genome Res. 2009 Dec;19(12):2334-42. doi: 10.1101/gr.098640.109. Epub 2009 Oct 2.

High-throughput expression of C. elegans proteins.

Genome Res. 2004 Oct;14(10B):2102-10. doi: 10.1101/gr.2520504.

Open-reading-frame sequence tags (OSTs) support the existence of at least 17,300 genes in C. elegans.

Nat Genet. 2001 Mar;27(3):332-6. doi: 10.1038/85913.

Human ORFeome version 1.1: a platform for reverse proteomics.

Genome Res. 2004 Oct;14(10B):2128-35. doi: 10.1101/gr.2973604.

The full-ORF clone resource of the German cDNA Consortium.

BMC Genomics. 2007 Oct 31;8:399. doi: 10.1186/1471-2164-8-399.

Cited by

Characterizing a standardized BioPart for BAG-specific expression in C. elegans.

MicroPubl Biol. 2024 Mar 12;2024. doi: 10.17912/micropub.biology.001150. eCollection 2024.

SPOP loss of function protects against tauopathy.

Proc Natl Acad Sci U S A. 2023 Jan 3;120(1):e2207250120. doi: 10.1073/pnas.2207250120. Epub 2022 Dec 27.

Proximity labeling identifies LOTUS domain proteins that promote the formation of perinuclear germ granules in C. elegans.

Elife. 2021 Nov 3;10:e72276. doi: 10.7554/eLife.72276.

Protein interactome mapping in Caenorhabditis elegans.

Curr Opin Syst Biol. 2019 Feb;13:1-9. doi: 10.1016/j.coisb.2018.08.006.

A functionally defined high-density NRF2 interactome reveals new conditional regulators of ARE transactivation.

Redox Biol. 2020 Oct;37:101686. doi: 10.1016/j.redox.2020.101686. Epub 2020 Aug 20.

The full-length transcriptome of C. elegans using direct RNA sequencing.

Genome Res. 2020 Feb;30(2):299-312. doi: 10.1101/gr.251314.119. Epub 2020 Feb 5.

Maximizing binary interactome mapping with a minimal number of assays.

Nat Commun. 2019 Aug 29;10(1):3907. doi: 10.1038/s41467-019-11809-2.

A Myt1 family transcription factor defines neuronal fate by repressing non-neuronal genes.

Elife. 2019 Aug 6;8:e46703. doi: 10.7554/eLife.46703.

Gene2Function: An Integrated Online Resource for Gene Function Discovery.

G3 (Bethesda). 2017 Aug 7;7(8):2855-2858. doi: 10.1534/g3.117.043885.

A gene-centered C. elegans protein-DNA interaction network provides a framework for functional predictions.

Mol Syst Biol. 2016 Oct 24;12(10):884. doi: 10.15252/msb.20167131.
