Shah Manesh, Passovets Sergei, Kim Dongsup, Ellrott Kyle, Wang Li, Vokler Inna, LoCascio Philip, Xu Dong, Xu Ying
Life Sciences Division, Oak Ridge National Laboratory, TN 37830-6480, USA.
Bioinformatics. 2003 Oct 12;19(15):1985-96. doi: 10.1093/bioinformatics/btg262.
Experimental techniques alone cannot keep up with the rate at which protein sequences are produced, while computational techniques for protein structure prediction have matured enough to provide reliable structural characterization of proteins at large scale. Integrating multiple computational tools for protein structure prediction can complement experimental techniques.
We present an automated pipeline for protein structure prediction. The centerpiece of the pipeline is our threading-based protein structure prediction system, PROSPECT. The pipeline consists of a dozen tools covering tasks such as identification of protein domains and signal peptides, protein triage to determine the protein type (membrane or globular), protein fold recognition, generation of atomic structural models, and validation of prediction results. A prediction pipeline manager automatically selects the processing and prediction branches based on the identified characteristics of each protein. The pipeline has been implemented to run in a heterogeneous computational environment as a client/server system with a web interface. Genome-scale applications to Caenorhabditis elegans, Pyrococcus furiosus and three cyanobacterial genomes are presented.
The pipeline is available at http://compbio.ornl.gov/proteinpipeline/
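The branching behavior described in the abstract, where a pipeline manager chooses processing steps from the identified characteristics of a protein, can be illustrated with a minimal Python sketch. This is not the PROSPECT pipeline's code or API. The class `ProteinFeatures`, the function `plan_steps`, the step names, and the specific branching rules (for example, that membrane proteins leave the fold-recognition branch) are all hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProteinFeatures:
    """Characteristics reported by upstream tools (all fields are hypothetical)."""
    has_signal_peptide: bool = False
    is_membrane: bool = False
    domain_count: int = 1


def plan_steps(features: ProteinFeatures) -> List[str]:
    """Return an ordered list of step names for one protein.

    The rules below are illustrative assumptions, not the published logic.
    """
    steps: List[str] = []

    # Assumption: a detected signal peptide is handled before anything else.
    if features.has_signal_peptide:
        steps.append("remove_signal_peptide")

    # Triage: membrane proteins take a separate branch (assumed to end here).
    if features.is_membrane:
        steps.append("membrane_protein_branch")
        return steps

    # Assumption: multi-domain globular proteins are split before threading.
    if features.domain_count > 1:
        steps.append("split_into_domains")

    # Globular branch: fold recognition, model building, then validation.
    steps += ["fold_recognition", "build_atomic_model", "validate_prediction"]
    return steps


if __name__ == "__main__":
    print(plan_steps(ProteinFeatures(has_signal_peptide=True, domain_count=2)))
```

Returning a plain list of step names keeps the decision logic separate from the tools that would execute each step, so the rules can be checked without running any predictor.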