The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching.

Author information

Willighagen Egon L, Mayfield John W, Alvarsson Jonathan, Berg Arvid, Carlsson Lars, Jeliazkova Nina, Kuhn Stefan, Pluskal Tomáš, Rojas-Chertó Miquel, Spjuth Ola, Torrance Gilleain, Evelo Chris T, Guha Rajarshi, Steinbeck Christoph

Affiliations

Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6200 MD, Maastricht, The Netherlands.

NextMove Software Ltd, Cambridge, CB4 0EY, UK.

Publication information

J Cheminform. 2017 Jun 6;9(1):33. doi: 10.1186/s13321-017-0220-4.

Abstract

BACKGROUND

The Chemistry Development Kit (CDK) is a widely used open source cheminformatics toolkit, providing data structures to represent chemical concepts along with methods to manipulate such structures and perform computations on them. The library implements a wide variety of cheminformatics algorithms ranging from chemical structure canonicalization to molecular descriptor calculations and pharmacophore perception. It is used in drug discovery, metabolomics, and toxicology. Over the last 10 years, however, the code base has grown significantly, resulting in many complex interdependencies among components and poor performance of many algorithms.

RESULTS

We report improvements to the CDK v2.0 since the v1.2 release series, specifically addressing the increased functional complexity and poor performance. We first summarize the addition of new functionality, such as atom typing and molecular formula handling, and improvements to existing functionality that have led to significantly better performance for substructure searching, molecular fingerprints, and rendering of molecules. Second, we outline how the CDK has evolved with respect to quality control and the approaches we have adopted to ensure stability, including a code review mechanism.

CONCLUSIONS

This paper highlights our continued efforts to provide a community-driven, open source cheminformatics library, and shows that such collaborative projects can thrive over extended periods of time, resulting in a high-quality and performant library. By taking advantage of community support and contributions, we show that an open source cheminformatics project can act as a peer-reviewed publishing platform for scientific computing software.

Graphical abstract: CDK 2.0 provides new features and improved performance.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/465e/5461230/c3416ce61b04/13321_2017_220_Figa_HTML.jpg
