Bietz Stefan, Fährrolfes Rainer, Rarey Matthias
University of Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany.
Mol Inform. 2016 Dec;35(11-12):593-598. doi: 10.1002/minf.201600043. Epub 2016 May 30.
Structure-based drug design starts with the collection, preparation, and initial analysis of protein structures. With more than 115,000 structures publicly available in the Protein Data Bank (PDB), fully automated processes that reliably perform these important preprocessing steps are needed. Several tools are available for these tasks; however, most of them do not address the special needs of scientists interested in protein-ligand interactions. In this paper, we summarize our research activities towards an automated processing pipeline from raw PDB data to ready-to-use protein binding site ensembles. Starting from a single protein structure, the pipeline covers the following phases: extracting structurally related binding sites from the PDB, aligning disconnected binding site sequences, resolving tautomeric forms and protonation, orienting hydrogens and flippable side-chains, structurally aligning the multitude of binding sites, and performing a reasonable reduction of ensemble structures. The pipeline, named SIENA, creates protein structural ensembles within seconds for the analysis of protein flexibility and for molecular design efforts such as docking or de novo design. For the first time, we are able to process the whole PDB in order to create a large collection of protein binding site ensembles. SIENA is available as part of the ZBH ProteinsPlus webserver at http://proteinsplus.zbh.uni-hamburg.de.
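The final phase named above, reducing an ensemble of already superposed binding sites, can be illustrated with a minimal sketch. This is not SIENA's algorithm, which the abstract does not specify; it assumes a simple greedy filter on pairwise RMSD. The function names `rmsd` and `reduce_ensemble`, the `min_rmsd` threshold, and the toy coordinates are all hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

# One binding site = ordered list of atom coordinates (assumed pre-superposed
# and with a one-to-one atom correspondence between sites).
Coords = Sequence[Tuple[float, float, float]]


def rmsd(a: Coords, b: Coords) -> float:
    """Root-mean-square deviation between two equally sized coordinate sets."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def reduce_ensemble(sites: Sequence[Coords], min_rmsd: float) -> List[int]:
    """Greedy reduction (an assumed strategy, not SIENA's): keep a site only
    if it deviates by at least min_rmsd from every site kept so far.
    Returns the indices of the retained sites in input order."""
    kept: List[int] = []
    for idx, site in enumerate(sites):
        if all(rmsd(site, sites[k]) >= min_rmsd for k in kept):
            kept.append(idx)
    return kept


# Toy two-atom "binding sites": site_b is nearly identical to site_a,
# site_c is shifted by 2 length units along z.
site_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
site_b = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)]
site_c = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]

representatives = reduce_ensemble([site_a, site_b, site_c], min_rmsd=0.5)
print(representatives)  # [0, 2]
```

With a threshold of 0.5, the near-duplicate second site is dropped and the first and third sites remain as representatives. A greedy filter like this depends on input order; a clustering-based selection would be an alternative design.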