Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany.
Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany.
J Chem Inf Model. 2023 Aug 14;63(15):4505-4532. doi: 10.1021/acs.jcim.3c00643. Epub 2023 Jul 19.
The integration of machine learning concepts and algorithms into computational chemistry has increased significantly. In this Perspective, we survey 179 open-source software projects, each with a corresponding peer-reviewed paper published within the last 5 years, to better understand which topics within the field are being investigated with machine learning approaches. For each project, we provide a short description, a link to the code, the accompanying license type, and whether the training data and resulting models are publicly available. For the projects deposited in GitHub repositories, we identify the most commonly used Python libraries. We hope that this survey will serve as a resource for learning about machine learning, or specific architectures thereof, by identifying accessible codes with accompanying papers on a topic-by-topic basis. To this end, we also include open-source computational chemistry software for generating training data and fundamental Python libraries for machine learning. Based on our observations, and considering the three pillars of collaborative machine learning work (open data, open source code, and open models), we provide some suggestions to the community.