Steinbeck Christoph
Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University, Jena, Germany.
J Cheminform. 2025 Apr 3;17(1):44. doi: 10.1186/s13321-025-00990-w.
Cheminformatics has significantly transformed over the past four decades, evolving from a field dominated by proprietary systems to one increasingly embracing open science principles. In its early years, cheminformatics was characterised by commercial software and restricted data access, limiting collaboration and reproducibility. The advent of open-source software in the late 1990s and early 2000s, including tools such as the Chemistry Development Kit (CDK) and RDKit, played a crucial role in democratising computational chemistry. Open data initiatives, such as PubChem and NMRShiftDB, further enhanced accessibility by providing freely available chemical information, fostering transparency and interoperability. The introduction of key standards, such as the International Chemical Identifier (InChI), revolutionised data integration and retrieval across diverse platforms. Community-driven efforts, including the Blue Obelisk movement and Open Notebook Science, have promoted open methodologies and collaborative research. More recently, national data infrastructure projects like NFDI4Chem have aimed to standardise research data management in cheminformatics, ensuring the long-term sustainability of open science practices. The increasing adoption of the FAIR (Findable, Accessible, Interoperable, Reusable) principles has further reinforced data sharing and reuse in computational chemistry. Challenges remain, particularly in overcoming resistance to data sharing and ensuring sustainable funding for open projects. However, the trajectory of cheminformatics demonstrates that embracing openness enhances scientific integrity and accelerates discovery and innovation.