High-throughput protein analysis integrating bioinformatics and experimental assays.

Authors

del Val Coral, Mehrle Alexander, Falkenhahn Mechthild, Seiler Markus, Glatting Karl-Heinz, Poustka Annemarie, Suhai Sandor, Wiemann Stefan

Affiliation

Division of Molecular Biophysics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany.

Publication information

Nucleic Acids Res. 2004 Feb 3;32(2):742-8. doi: 10.1093/nar/gkh257. Print 2004.

Abstract

The wealth of transcript information that has been made publicly available in recent years requires the development of high-throughput functional genomics and proteomics approaches for its analysis. Such approaches need suitable data integration procedures and a high level of automation in order to gain maximum benefit from the results generated. We have designed an automatic pipeline to analyse annotated open reading frames (ORFs) stemming from full-length cDNAs produced mainly by the German cDNA Consortium. The ORFs are cloned into expression vectors for use in large-scale assays such as the determination of subcellular protein localization or kinase reaction specificity. Additionally, all identified ORFs undergo exhaustive bioinformatic analysis such as similarity searches, protein domain architecture determination and prediction of physicochemical characteristics and secondary structure, using a wide variety of bioinformatic methods in combination with the most up-to-date public databases (e.g. PRINTS, BLOCKS, INTERPRO, PROSITE, SWISSPROT). Data from experimental results and from the bioinformatic analysis are integrated and stored in a relational database (MS SQL-Server), which makes it possible for researchers to find answers to biological questions easily, thereby speeding up the selection of targets for further analysis. The designed pipeline constitutes a new automatic approach to obtaining and administrating relevant biological data from high-throughput investigations of cDNAs in order to systematically identify and characterize novel genes, as well as to comprehensively describe the function of the encoded proteins.
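As an illustration of the integration step described in the abstract, the following is a minimal sketch of how experimental and bioinformatic results might sit side by side in a relational database and be queried together. It is not the authors' schema: the table names (`orf`, `localization`, `domain_hit`), the column names, the ORF identifiers and every data row are hypothetical, and SQLite stands in for the MS SQL-Server database named in the abstract so that the example is self-contained.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical three-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orf (orf_id TEXT PRIMARY KEY, length_aa INTEGER);
        -- experimental result: subcellular localization assay
        CREATE TABLE localization (
            orf_id TEXT REFERENCES orf(orf_id),
            compartment TEXT
        );
        -- bioinformatic result: domain hit from a public database
        CREATE TABLE domain_hit (
            orf_id TEXT REFERENCES orf(orf_id),
            source_db TEXT,
            domain TEXT
        );
        """
    )
    # All rows below are invented purely for illustration.
    conn.executemany(
        "INSERT INTO orf VALUES (?, ?)",
        [("ORF001", 412), ("ORF002", 238), ("ORF003", 655)],
    )
    conn.executemany(
        "INSERT INTO localization VALUES (?, ?)",
        [("ORF001", "nucleus"), ("ORF002", "mitochondria"), ("ORF003", "nucleus")],
    )
    conn.executemany(
        "INSERT INTO domain_hit VALUES (?, ?, ?)",
        [
            ("ORF001", "PROSITE", "protein kinase"),
            ("ORF002", "PRINTS", "zinc finger"),
            ("ORF003", "INTERPRO", "zinc finger"),
        ],
    )
    conn.commit()
    return conn


def orfs_by_location_and_domain(conn, compartment, domain):
    """Return ORF ids matching both an experimental and a predicted property."""
    rows = conn.execute(
        """
        SELECT DISTINCT o.orf_id
        FROM orf AS o
        JOIN localization AS l ON l.orf_id = o.orf_id
        JOIN domain_hit AS d ON d.orf_id = o.orf_id
        WHERE l.compartment = ? AND d.domain = ?
        ORDER BY o.orf_id
        """,
        (compartment, domain),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(orfs_by_location_and_domain(conn, "nucleus", "zinc finger"))
```

A single join of this kind is one way a question such as "which nuclear proteins carry a predicted zinc finger domain?" could be answered once both data sources share an ORF identifier, which is the sort of target selection the abstract refers to.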
