Lee Man-Ling, Aliagas Ignacio, Feng Jianwen A, Gabriel Thomas, O'Donnell T J, Sellers Benjamin D, Wiswedel Bernd, Gobbi Alberto
Discovery Chemistry, Genentech Inc., 1 DNA Way, South San Francisco, CA, 94080, USA.
Denali Therapeutics, 151 Oyster Point Blvd, South San Francisco, CA, 94080, USA.
J Cheminform. 2017 Jun 12;9(1):38. doi: 10.1186/s13321-017-0228-9.
Analyzing files containing chemical information is at the core of cheminformatics. Each analysis may require a unique workflow. This paper describes the chemalot and chemalot_knime open source packages. Chemalot is a set of command line programs with a wide range of functionalities for cheminformatics. The chemalot_knime package allows command line programs that read SD files from stdin and write them to stdout to be wrapped into KNIME nodes. The combination of chemalot and chemalot_knime not only facilitates the compilation and maintenance of sequences of command line programs but also allows KNIME workflows to take advantage of the compute power of a Linux cluster.
Use of the command line programs is demonstrated in three different workflow examples: (1) a workflow to create a data file with project-relevant data for structure-activity or property analysis and other types of investigation, (2) the creation of a quantitative structure-property relationship model using the command line programs via KNIME nodes, and (3) the analysis of strain energy in small molecule ligand conformations from the Protein Data Bank.
The chemalot and chemalot_knime packages provide lightweight and powerful tools for many tasks in cheminformatics. They are easily integrated with other open source and commercial command line tools and can be combined to build new and even more powerful tools. The chemalot_knime package facilitates the generation and maintenance of user-defined command line workflows, taking advantage of the graphical design capabilities in KNIME.
Graphical abstract: Example KNIME workflow with chemalot nodes and the corresponding command line pipe.
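The convention the abstract relies on, programs that read SD files from stdin and write SD files to stdout so that they can be chained in a pipe, can be illustrated with a minimal sketch. The code below is not part of chemalot; the function names, the `LogP` tag, and the threshold are assumptions chosen purely for illustration. It keeps only the records whose numeric data field is at or below a limit and passes them through unchanged.

```python
import sys
from typing import Dict, Iterator, TextIO

RECORD_END = "$$$$"  # SD file record terminator


def read_sd_records(stream: TextIO) -> Iterator[str]:
    """Yield one SD record at a time, including its terminating $$$$ line."""
    lines = []
    for line in stream:
        lines.append(line)
        if line.strip() == RECORD_END:
            yield "".join(lines)
            lines = []


def data_fields(record: str) -> Dict[str, str]:
    """Collect the '> <TAG>' data items of one record into a dict."""
    fields: Dict[str, str] = {}
    tag = None
    for line in record.splitlines():
        if line.startswith(">") and "<" in line and ">" in line[1:]:
            # Data header line, e.g. "> <LogP>"
            tag = line[line.index("<") + 1 : line.rindex(">")]
            fields[tag] = ""
        elif tag is not None:
            if line.strip() in ("", RECORD_END):
                tag = None  # a blank line ends the current data item
            else:
                fields[tag] = f"{fields[tag]}\n{line}" if fields[tag] else line
    return fields


def filter_records(inp: TextIO, out: TextIO, tag: str, max_value: float) -> int:
    """Copy records whose numeric `tag` value is <= max_value; return the count."""
    kept = 0
    for record in read_sd_records(inp):
        value = data_fields(record).get(tag)
        try:
            keep = value is not None and float(value) <= max_value
        except ValueError:
            keep = False  # non-numeric values are dropped
        if keep:
            out.write(record)
            kept += 1
    return kept


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python sd_filter.py LogP 3.0 < in.sdf > out.sdf
    filter_records(sys.stdin, sys.stdout, sys.argv[1], float(sys.argv[2]))
```

Because the filter touches only stdin and stdout, it could sit anywhere in a pipe of similar tools, which is the property that makes such programs candidates for wrapping as workflow nodes. The script name `sd_filter.py` is hypothetical.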