Chemical language models for molecular design.

Affiliations

Department of Life Science Informatics, Bonn-Aachen International Center for Information Technology, Rheinische Friedrich-Wilhelms-Universität Bonn, Friedrich-Hirzebruch-Allee 5/6, D-53115, Bonn, Germany.

Lamarr Institute for Machine Learning and Artificial Intelligence, Rheinische Friedrich-Wilhelms-Universität Bonn, Friedrich-Hirzebruch-Allee 5/6, D-53115, Bonn, Germany.

Publication information

Mol Inform. 2024 Jan;43(1):e202300288. doi: 10.1002/minf.202300288. Epub 2023 Dec 12.

Abstract

In drug discovery, chemical language models (CLMs) originating from natural language processing offer new opportunities for molecular design. CLMs have been developed using recurrent neural network (RNN) or transformer architectures. Attention mechanisms play a central role in the predictive performance of both RNN-based encoder-decoder frameworks and transformers. Emerging application areas for CLMs include, among others, constrained generative modeling and the prediction of chemical reactions or drug-target interactions. Since CLMs are applicable to any compound or target data that can be presented in a sequential format and tokenized, mappings between different types of sequences can be learned. For example, active compounds can be predicted from protein sequence motifs. Novel off-the-beaten-path applications can also be considered. For example, analogue series from medicinal chemistry can be perceived and represented as chemical sequences and extended with new compounds using CLMs. Herein, methodological features of CLMs and different applications are discussed.
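
To make "presented in a sequential format and tokenized" concrete, the following is a minimal sketch of how a compound written as a SMILES string might be split into tokens and mapped to integer ids before being passed to an RNN or transformer. The abstract does not specify a tokenization scheme; the regular expression, the special tokens (`<pad>`, `<bos>`, `<eos>`), and the function names below are illustrative assumptions, not the authors' method.

```python
import re

# Assumed token pattern: bracket atoms, two-letter halogens, single-letter
# atoms, bonds, branches, and ring-closure digits. Illustrative only.
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def tokenize_smiles(smiles):
    """Split a SMILES string into a list of tokens."""
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    # Reject input containing characters the pattern does not cover.
    if "".join(tokens) != smiles:
        raise ValueError(f"Could not tokenize SMILES completely: {smiles!r}")
    return tokens


def build_vocab(token_lists):
    """Assign an integer id to each distinct token (hypothetical special tokens first)."""
    vocab = {"<pad>": 0, "<bos>": 1, "<eos>": 2}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Convert tokens to ids, wrapped in begin- and end-of-sequence markers."""
    return [vocab["<bos>"]] + [vocab[t] for t in tokens] + [vocab["<eos>"]]


if __name__ == "__main__":
    aspirin = tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O")
    vocab = build_vocab([aspirin])
    print(aspirin)
    print(encode(aspirin, vocab))
```

The same pattern applies to other sequence types mentioned above: a protein sequence could be tokenized per amino acid residue, so that a model learns a mapping from one token sequence to another.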
