MultiGran-SMILES: multi-granularity SMILES learning for molecular property prediction.

Author information

School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China.

Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou 730030, China.

Publication information

Bioinformatics. 2022 Sep 30;38(19):4573-4580. doi: 10.1093/bioinformatics/btac550.

DOI:10.1093/bioinformatics/btac550

PMID:35961025

Abstract

MOTIVATION

Extracting useful molecular features is essential for molecular property prediction. Atom-level representation is common but, to some extent, ignores the sub-structure or branch information of molecules; substring-level representation has the opposite limitation. Both atom-level and substring-level representations may lose the neighborhood or spatial information of molecules. Molecular graph representation aggregates the neighborhood information of a molecule but is weak at expressing chiral molecules or symmetrical structures. In this article, we aim to exploit the advantages of representations at different granularities simultaneously for molecular property prediction. To this end, we propose a fusion model named MultiGran-SMILES, which integrates the molecular features of atoms, sub-structures and graphs from the input. Compared with single-granularity representations of molecules, our method leverages the advantages of the various granularities and adaptively adjusts the contribution of each type of representation for molecular property prediction.
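To make the idea of granularities and adaptive weighting concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a regex-based atom-level tokenizer, approximates substring-level tokens with overlapping n-grams of atom tokens, and fuses per-granularity feature vectors with softmax weights. The names `atom_tokens`, `substring_tokens`, `fuse` and the hand-set gate scores are hypothetical; the actual model may segment SMILES and learn its weights differently, and the graph branch is omitted here.

```python
import math
import re

# Atom-level SMILES tokens: bracket atoms, two-letter halogens, organic-subset
# atoms, aromatic atoms, two-digit ring closures, digits, bond/branch symbols.
ATOM_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|%\d{2}|\d|[=#\-+\\/()@.:~*$]"
)


def atom_tokens(smiles):
    """Atom-level granularity: one token per atom, bond, branch or ring symbol."""
    tokens = ATOM_TOKEN.findall(smiles)
    # Reject input that the pattern could not cover completely.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognised characters in SMILES: {smiles!r}")
    return tokens


def substring_tokens(smiles, n=3):
    """Substring-level granularity, approximated by overlapping n-grams of
    atom-level tokens (an assumption made for illustration only)."""
    toks = atom_tokens(smiles)
    if len(toks) <= n:
        return ["".join(toks)]
    return ["".join(toks[i:i + n]) for i in range(len(toks) - n + 1)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(features, gate_scores):
    """Adaptive fusion: weight each granularity's feature vector by a softmax
    over gate scores (learned in a real model, hand-set here) and sum them."""
    weights = softmax(gate_scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


if __name__ == "__main__":
    alanine = "C[C@H](N)C(=O)O"
    print(atom_tokens(alanine))       # bracket atom [C@H] stays one token
    print(substring_tokens(alanine))  # overlapping 3-grams of those tokens
    # Toy feature vectors for the atom, substring and graph views.
    print(fuse([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0]))
```

In this sketch the chirality marker survives only because the bracket atom is kept as a single token, which is one reason string-based views can complement a plain connectivity graph. A higher gate score for one view shifts the fused vector toward that view's features.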

RESULTS

The experimental results show that our MultiGran-SMILES method achieves state-of-the-art performance on the BBBP, LogP, HIV and ClinTox datasets. For the BACE, FDA and Tox21 datasets, the results are comparable with those of state-of-the-art models. Moreover, the gains of our proposed method are larger for molecules with obvious functional groups or branches.

AVAILABILITY AND IMPLEMENTATION

The code and data underlying this work are available on GitHub at https://github.com/Jiangjing0122/MultiGran.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
