fragSMILES as a chemical string notation for advanced fragment and chirality representation.

Authors

Mastrolorito Fabrizio, Ciriaco Fulvio, Togo Maria Vittoria, Gambacorta Nicola, Trisciuzzi Daniela, Altomare Cosimo Damiano, Amoroso Nicola, Grisoni Francesca, Nicolotti Orazio

Affiliations

Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Dipartimento di Chimica, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Publication information

Commun Chem. 2025 Jan 29;8(1):26. doi: 10.1038/s42004-025-01423-3.

DOI: 10.1038/s42004-025-01423-3
PMID: 39880917
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11779804/

Abstract

Generative models have revolutionized de novo drug design, making it possible to produce molecules on demand with desired physicochemical and pharmacological properties. String-based molecular representations, such as SMILES (Simplified Molecular Input Line Entry System) and SELFIES (Self-Referencing Embedded Strings), have played a pivotal role in the success of generative approaches, thanks to their capacity to encode atom and bond information and their ease of generation. However, such 'atom-level' string representations can be limited in how well they capture chirality and the synthetic accessibility of the corresponding designs. In this paper, we present fragSMILES, a novel fragment-based molecular representation in string form. fragSMILES encodes fragments in a 'chemically meaningful' way via a novel graph-reduction approach, yielding an efficient, interpretable, and expressive molecular representation that also avoids fragment redundancy. fragSMILES contributes to the field of fragment-based representation by reporting fragments and their 'breaking' bonds independently. Moreover, fragSMILES embeds information on molecular chirality, thereby overcoming known limitations of existing string notations. When compared with SMILES, SELFIES, and t-SMILES for de novo design, the fragSMILES notation showed promise in generating molecules with desirable biochemical and scaffold properties.
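To make the 'atom-level' point concrete, the sketch below tokenizes two SMILES strings with a regular expression and shows that a pair of enantiomers differs in a single bracket-atom token (`@` versus `@@`). It is not the fragSMILES algorithm and uses no code from the paper. The regex, the `tokenize` helper, and the choice of alanine as the example are illustrative assumptions, and the pattern covers only a common subset of SMILES syntax.

```python
import re

# Illustrative atom-level SMILES tokenizer (an assumption, not from the paper).
# Order matters: bracket atoms and two-letter halogens must match before
# single-letter atoms, and two-digit ring closures (%NN) before single digits.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]|%\d{2}|\d|[=#\-+\\/().:~*$]"
)


def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into atom-level tokens, failing on unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


# The two enantiomers of alanine, written with the same atom order.
ala_a = "N[C@@H](C)C(=O)O"
ala_b = "N[C@H](C)C(=O)O"

tok_a = tokenize(ala_a)
tok_b = tokenize(ala_b)

# Positions where the two token sequences disagree.
diff = [(a, b) for a, b in zip(tok_a, tok_b) if a != b]

print(tok_a)
print(diff)  # [('[C@@H]', '[C@H]')]
```

In this encoding the whole stereochemical difference sits in one token whose meaning depends on the order of the neighbouring atoms in the string, which is one way to read the abstract's remark about chirality in atom-level notations.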
