A biomedically enriched collection of 7000 human ORF clones.

Authors

Rolfs Andreas, Hu Yanhui, Ebert Lars, Hoffmann Dietmar, Zuo Dongmei, Ramachandran Niro, Raphael Jacob, Kelley Fontina, McCarron Seamus, Jepson Daniel A, Shen Binghua, Baqui Munira M A, Pearlberg Joseph, Taycher Elena, DeLoughery Craig, Hoerlein Andreas, Korn Bernhard, LaBaer Joshua

Affiliation

Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, USA.

Publication

PLoS One. 2008 Jan 30;3(1):e1528. doi: 10.1371/journal.pone.0001528.

Abstract

We report the production and availability of over 7000 fully sequence-verified plasmid ORF clones representing over 3400 unique human genes. These ORF clones were derived using the human MGC collection as template and were produced in two formats: with and without stop codons. Thus, this collection supports the production of either native protein or proteins with fusion tags added to either or both ends. The template clones used to generate this collection were enriched in three ways. First, gene redundancy was removed. Second, clones were selected to represent the best available GenBank reference sequence. Finally, a literature-based software tool was used to evaluate the list of target genes to ensure that it broadly reflected biomedical research interests. The target gene list was compared with 4000 human diseases and over 8500 biological and chemical MeSH classes in approximately 15 million publications recorded in PubMed at the time of analysis. This analysis revealed that, relative to the genome and the MGC collection, this collection is enriched for genes with published associations with a wide range of diseases and biomedical terms, without displaying a particular bias towards any single disease or concept. Thus, this collection is likely to be a powerful resource for researchers who wish to study protein function in a set of genes with documented biomedical significance.
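The enrichment comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' software tool, whose details are not given here. It assumes a hypothetical mapping from gene symbols to the set of disease or MeSH terms they have been published with, and it compares how often genes in a collection carry at least one such association relative to a background set such as the genome. All gene and term names below are invented placeholders.

```python
from typing import Dict, Set


def association_rate(genes: Set[str], gene_to_terms: Dict[str, Set[str]]) -> float:
    """Fraction of genes that have at least one published term association."""
    if not genes:
        return 0.0
    hits = sum(1 for g in genes if gene_to_terms.get(g))
    return hits / len(genes)


def enrichment_ratio(
    collection: Set[str],
    background: Set[str],
    gene_to_terms: Dict[str, Set[str]],
) -> float:
    """Association rate in the collection divided by the rate in the background."""
    bg_rate = association_rate(background, gene_to_terms)
    if bg_rate == 0:
        raise ValueError("background has no associated genes; ratio is undefined")
    return association_rate(collection, gene_to_terms) / bg_rate


# Toy data (hypothetical): four background genes, two of which are in the collection.
gene_to_terms = {
    "GENE_A": {"disease_1"},
    "GENE_B": {"disease_2"},
    "GENE_C": set(),
}
background = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
collection = {"GENE_A", "GENE_B"}

ratio = enrichment_ratio(collection, background, gene_to_terms)
```

A ratio above 1 would indicate that the collection carries proportionally more literature-associated genes than the background. Checking for bias towards a single disease would require an additional per-term breakdown, which this sketch omits.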

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2672/2211400/b9d882c70a8e/pone.0001528.g001.jpg
