High-throughput protein analysis integrating bioinformatics and experimental assays.

Author information

del Val Coral, Mehrle Alexander, Falkenhahn Mechthild, Seiler Markus, Glatting Karl-Heinz, Poustka Annemarie, Suhai Sandor, Wiemann Stefan

Affiliation

Division of Molecular Biophysics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany.

Publication information

Nucleic Acids Res. 2004 Feb 3;32(2):742-8. doi: 10.1093/nar/gkh257. Print 2004.

DOI:10.1093/nar/gkh257

PMID:14762202

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC373366/

Abstract

The wealth of transcript information that has been made publicly available in recent years requires the development of high-throughput functional genomics and proteomics approaches for its analysis. Such approaches need suitable data integration procedures and a high level of automation in order to gain maximum benefit from the results generated. We have designed an automatic pipeline to analyse annotated open reading frames (ORFs) stemming from full-length cDNAs produced mainly by the German cDNA Consortium. The ORFs are cloned into expression vectors for use in large-scale assays such as the determination of subcellular protein localization or kinase reaction specificity. Additionally, all identified ORFs undergo exhaustive bioinformatic analysis such as similarity searches, protein domain architecture determination and prediction of physicochemical characteristics and secondary structure, using a wide variety of bioinformatic methods in combination with the most up-to-date public databases (e.g. PRINTS, BLOCKS, INTERPRO, PROSITE, SWISSPROT). Data from experimental results and from the bioinformatic analysis are integrated and stored in a relational database (MS SQL-Server), which makes it possible for researchers to find answers to biological questions easily, thereby speeding up the selection of targets for further analysis. The designed pipeline constitutes a new automatic approach to obtaining and administering relevant biological data from high-throughput investigations of cDNAs in order to systematically identify and characterize novel genes, as well as to comprehensively describe the function of the encoded proteins.
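As an illustration of the integration step described in the abstract, the following is a minimal sketch of how experimental and bioinformatic results for the same ORFs might sit side by side in a relational database and be queried together to shortlist targets. It is not the authors' schema: the table names, column names, ORF identifiers and all rows are invented placeholders, and Python's built-in `sqlite3` module stands in for MS SQL-Server purely so the example is self-contained.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption
# made for illustration, not taken from the published pipeline.
SCHEMA = """
CREATE TABLE orf (
    orf_id    TEXT PRIMARY KEY,
    length_aa INTEGER NOT NULL
);
CREATE TABLE domain_hit (            -- bioinformatic analysis results
    orf_id      TEXT NOT NULL REFERENCES orf(orf_id),
    source_db   TEXT NOT NULL,       -- e.g. PRINTS, BLOCKS, PROSITE
    domain_name TEXT NOT NULL
);
CREATE TABLE localization (          -- experimental assay results
    orf_id      TEXT NOT NULL REFERENCES orf(orf_id),
    compartment TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database filled with invented placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO orf VALUES (?, ?)",
        [("ORF001", 412), ("ORF002", 238), ("ORF003", 655)],
    )
    conn.executemany(
        "INSERT INTO domain_hit VALUES (?, ?, ?)",
        [
            ("ORF001", "PROSITE", "protein kinase domain"),
            ("ORF003", "PRINTS", "protein kinase domain"),
            ("ORF002", "BLOCKS", "zinc finger"),
        ],
    )
    conn.executemany(
        "INSERT INTO localization VALUES (?, ?)",
        [("ORF001", "nucleus"), ("ORF002", "nucleus"), ("ORF003", "cytoplasm")],
    )
    conn.commit()
    return conn


def find_targets(conn, domain_name, compartment):
    """Return ORF ids that have the given predicted domain AND the given
    experimentally observed subcellular compartment."""
    rows = conn.execute(
        """
        SELECT DISTINCT o.orf_id
        FROM orf AS o
        JOIN domain_hit   AS d ON d.orf_id = o.orf_id
        JOIN localization AS l ON l.orf_id = o.orf_id
        WHERE d.domain_name = ? AND l.compartment = ?
        ORDER BY o.orf_id
        """,
        (domain_name, compartment),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(find_targets(conn, "protein kinase domain", "nucleus"))
```

Keeping predictions and assay results in separate tables keyed on a shared ORF identifier is one common way to let a single join answer a combined question, such as which ORFs carry a predicted kinase domain and were also observed in the nucleus.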
