Kwon Youngchun, Jeon Hyunjeong, Choi Joonhyuk, Choi Youn-Suk, Kang Seokho
Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon, Republic of Korea.
Department of Industrial Engineering, Sungkyunkwan University, 2066 Seobu-ro, Jangan-gu, Suwon, Republic of Korea.
J Cheminform. 2025 Apr 10;17(1):51. doi: 10.1186/s13321-025-00987-5.
In synthesis planning, identifying and optimizing chemical reactions is essential to designing synthetic pathways to target substances, and chemical reaction databases help chemists gain insight into this process. Traditionally, searching a reaction database for relevant records has relied on chemists manually formulating queries to match their search purposes, which is difficult when they do not know exactly what they are looking for. In this study, we propose an intelligent chemical reaction search system that simplifies the process of improving search results. When a user submits a query, a list of relevant records is retrieved from the reaction database. Users then express their preferences and requirements by giving a binary rating to each retrieved record, and the search results are refined based on this feedback. To implement the system effectively, we incorporate and adapt contrastive representation learning, dimensionality reduction, and human-in-the-loop techniques. Contrastive learning trains a representation model that embeds records in the reaction database as numerical vectors suitable for chemical reaction search. Dimensionality reduction compresses these vectors to improve search efficiency. The human-in-the-loop component iteratively updates the representation model to reflect user feedback. Through experimental investigations, we demonstrate that the proposed method effectively steers the chemical reaction search toward better alignment with user preferences and requirements.
Scientific contribution: This study seeks to enhance the search functionality of chemical reaction databases by drawing inspiration from recommender systems. The proposed method simplifies the search process, offering an alternative to the complexity of formulating explicit query rules. We believe that the proposed method can help users efficiently discover records relevant to target reactions, especially when limited knowledge makes it difficult for them to craft detailed queries.
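The retrieve, rate, and refine loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions. Random vectors stand in for the embeddings that a contrastively trained representation model would produce. PCA computed with NumPy stands in for the dimensionality reduction step, whose actual method the abstract does not name. A Rocchio-style query update stands in for the human-in-the-loop step, whereas the paper updates the representation model itself. All function names and parameter values (`pca_reduce`, `search`, `refine_query`, `alpha`, `beta`, `gamma`) are hypothetical.

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X onto their top-k principal components."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:k]
    return (X - mean) @ components.T, mean, components

def normalize(V):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    return V / np.linalg.norm(V, axis=-1, keepdims=True)

def search(query, records, top_n):
    """Return indices of the top_n records most similar to the query."""
    scores = normalize(records) @ normalize(query)
    return np.argsort(-scores)[:top_n]

def refine_query(query, records, liked, disliked,
                 alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update (an assumed stand-in, not the paper's method):
    move the query toward liked records and away from disliked ones."""
    new_query = alpha * query
    if len(liked):
        new_query = new_query + beta * records[liked].mean(axis=0)
    if len(disliked):
        new_query = new_query - gamma * records[disliked].mean(axis=0)
    return new_query

# Placeholder data: 200 records with 64-dimensional embeddings (assumed sizes).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 64))

# Compress the embeddings to 16 dimensions before searching.
reduced, mean, components = pca_reduce(embeddings, k=16)

# Use record 0 as the query and retrieve the five nearest records.
query = reduced[0]
hits = search(query, reduced, top_n=5)

# Hypothetical binary ratings: the first two hits are relevant, the rest are not.
liked, disliked = hits[:2], hits[2:]
refined = refine_query(query, reduced, liked, disliked)
new_hits = search(refined, reduced, top_n=5)
```

Adjusting only the query vector keeps the sketch short, but it affects a single search session. Updating the representation model, as the abstract describes, would instead change how every record is embedded.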