BigSMILES: A Structurally-Based Line Notation for Describing Macromolecules.

Authors

Lin Tzyy-Shyang, Coley Connor W, Mochigase Hidenobu, Beech Haley K, Wang Wencong, Wang Zi, Woods Eliot, Craig Stephen L, Johnson Jeremiah A, Kalow Julia A, Jensen Klavs F, Olsen Bradley D

Affiliations

Department of Chemical Engineering and Department of Chemistry, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States.

Department of Chemistry, Duke University, Durham, North Carolina 27708, United States.

Publication information

ACS Cent Sci. 2019 Sep 25;5(9):1523-1531. doi: 10.1021/acscentsci.9b00476. Epub 2019 Sep 12.

DOI:10.1021/acscentsci.9b00476
PMID:31572779
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6764162/
Abstract

Having a compact yet robust structurally based identifier or representation system is a key enabling factor for efficient sharing and dissemination of research results within the chemistry community, and such systems lay down the essential foundations for future informatics and data-driven research. While substantial advances have been made for small molecules, the polymer community has struggled in coming up with an efficient representation system. This is because, unlike other disciplines in chemistry, the basic premise that each distinct chemical species corresponds to a well-defined chemical structure does not hold for polymers. Polymers are intrinsically stochastic molecules that are often ensembles with a distribution of chemical structures. This difficulty limits the applicability of all deterministic representations developed for small molecules. In this work, a new representation system that is capable of handling the stochastic nature of polymers is proposed. The new system is based on the popular "simplified molecular-input line-entry system" (SMILES), and it aims to provide representations that can be used as indexing identifiers for entries in polymer databases. As a pilot test, the entries of the standard data set of the glass transition temperature of linear polymers (Bicerano, 2002) were converted into the new BigSMILES language. Furthermore, it is hoped that the proposed system will provide a more effective language for communication within the polymer community and increase cohesion between the researchers within the community.
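The abstract does not spell out the notation itself. As a rough illustration of what a SMILES-based representation of a stochastic polymer can look like, the sketch below assumes the brace syntax of the BigSMILES specification: a stochastic object is wrapped in `{...}`, repeat units are separated by commas, optional end groups follow a semicolon, and bonding descriptors such as `[$]`, `[<]`, and `[>]` mark where units connect. The function name `split_stochastic_object` and the returned dictionary layout are hypothetical, and the code is not a full parser: it ignores nested stochastic objects and terminal bonding descriptors.

```python
import re

# Bonding descriptors such as [$], [<], [>], optionally with a numeric id (e.g. [$1]).
# Assumption: this covers only the simple descriptor forms.
BONDING_DESCRIPTOR = re.compile(r"\[([$<>])\d*\]")


def split_stochastic_object(bigsmiles: str) -> dict:
    """Split the first, non-nested stochastic object of a BigSMILES-like string.

    Returns the repeat units, the end groups, and the sorted set of
    bonding-descriptor types found inside the braces.
    """
    start = bigsmiles.find("{")
    end = bigsmiles.find("}", start)
    if start == -1 or end == -1:
        raise ValueError("no stochastic object found")

    body = bigsmiles[start + 1:end]
    # Repeat units come before the optional ';', end groups after it.
    units_part, _, ends_part = body.partition(";")
    repeat_units = [u for u in units_part.split(",") if u]
    end_groups = [e for e in ends_part.split(",") if e]

    descriptor_types = sorted(set(BONDING_DESCRIPTOR.findall(body)))
    return {
        "repeat_units": repeat_units,
        "end_groups": end_groups,
        "descriptor_types": descriptor_types,
    }


if __name__ == "__main__":
    # Random copolymer of ethylene and 1-butene written as one stochastic object.
    print(split_stochastic_object("{[$]CC[$],[$]CC(CC)[$]}"))
    # Nylon-6,6 style object with directional descriptors.
    print(split_stochastic_object("{[<]C(=O)CCCCC(=O)[<],[>]NCCCCCCN[>]}"))
```

The point of the sketch is only that one string can stand for an ensemble of chains: the repeat units inside the braces describe what may occur, not a single fixed structure.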


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2afc/6764162/8adb0ec01202/oc9b00476_0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2afc/6764162/17485bc23609/oc9b00476_0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2afc/6764162/18980d332d5c/oc9b00476_0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2afc/6764162/beed901678b5/oc9b00476_0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2afc/6764162/b0acc1dc12f0/oc9b00476_0004.jpg

Similar articles

1
BigSMILES: A Structurally-Based Line Notation for Describing Macromolecules.
ACS Cent Sci. 2019 Sep 25;5(9):1523-1531. doi: 10.1021/acscentsci.9b00476. Epub 2019 Sep 12.
2
Canonicalizing BigSMILES for Polymers with Defined Backbones.
ACS Polym Au. 2022 Dec 14;2(6):486-500. doi: 10.1021/acspolymersau.2c00009. Epub 2022 Oct 14.
3
Automated BigSMILES conversion workflow and dataset for homopolymeric macromolecules.
Sci Data. 2024 Apr 11;11(1):371. doi: 10.1038/s41597-024-03212-4.
4
Extending BigSMILES to non-covalent bonds in supramolecular polymer assemblies.
Chem Sci. 2022 Sep 15;13(41):12045-12055. doi: 10.1039/d2sc02257e. eCollection 2022 Oct 26.
5
Predicting Polymers' Glass Transition Temperature by a Chemical Language Processing Model.
Polymers (Basel). 2021 Jun 7;13(11):1898. doi: 10.3390/polym13111898.
6
PolyDAT: A Generic Data Schema for Polymer Characterization.
J Chem Inf Model. 2021 Mar 22;61(3):1150-1163. doi: 10.1021/acs.jcim.1c00028. Epub 2021 Feb 22.
7
ChemProps: A RESTful API enabled database for composite polymer name standardization.
J Cheminform. 2021 Mar 12;13(1):22. doi: 10.1186/s13321-021-00502-6.
8
Polygrammar: Grammar for Digital Polymer Representation and Generation.
Adv Sci (Weinh). 2022 Aug;9(23):e2101864. doi: 10.1002/advs.202101864. Epub 2022 Jun 9.
9
Molecular representations in AI-driven drug discovery: a review and practical guide.
J Cheminform. 2020 Sep 17;12(1):56. doi: 10.1186/s13321-020-00460-5.
10
PI1M: A Benchmark Database for Polymer Informatics.
J Chem Inf Model. 2020 Oct 26;60(10):4684-4690. doi: 10.1021/acs.jcim.0c00726. Epub 2020 Oct 8.

Cited by

1
Sequential EXtreme Gradient Boosting-Based Descriptor Reduction for Size Prediction of Zwitterionic Polymer-Based Nanoparticles.
ACS Omega. 2025 Jul 31;10(31):35146-35160. doi: 10.1021/acsomega.5c04425. eCollection 2025 Aug 12.
2
Hierarchical Sensing Framework for Polymer Degradation Monitoring: A Physics-Constrained Reinforcement Learning Framework for Programmable Material Discovery.
Sensors (Basel). 2025 Jul 18;25(14):4479. doi: 10.3390/s25144479.
3
Representation of Molecules by Sequences of Instructions.
J Chem Inf Model. 2025 Aug 11;65(15):7936-7955. doi: 10.1021/acs.jcim.5c00354. Epub 2025 Jul 28.
4
The Role of Artificial Intelligence and Machine Learning in Polymer Characterization: Emerging Trends and Perspectives.
Chromatographia. 2025;88(5):357-363. doi: 10.1007/s10337-025-04406-7. Epub 2025 Apr 4.
5
CGsmiles: A Versatile Line Notation for Molecular Representations across Multiple Resolutions.
J Chem Inf Model. 2025 Apr 14;65(7):3405-3419. doi: 10.1021/acs.jcim.5c00064. Epub 2025 Mar 24.
6
Database of Nonaqueous Proton-Conducting Materials.
ACS Appl Mater Interfaces. 2025 Mar 19;17(11):16901-16908. doi: 10.1021/acsami.4c22618. Epub 2025 Mar 9.
7
Functional monomer design for synthetically accessible polymers.
Chem Sci. 2025 Feb 13;16(11):4755-4767. doi: 10.1039/d4sc08617a. eCollection 2025 Mar 12.
8
Machine Learning in Polymer Research.
Adv Mater. 2025 Mar;37(11):e2413695. doi: 10.1002/adma.202413695. Epub 2025 Feb 9.
9
Molecular Dynamics (MD)-Derived Features for Canonical and Noncanonical Amino Acids.
J Chem Inf Model. 2025 Feb 24;65(4):1837-1849. doi: 10.1021/acs.jcim.4c02102. Epub 2025 Feb 2.
10
The TOXIN knowledge graph: supporting animal-free risk assessment of cosmetics.
Database (Oxford). 2025 Jan 28;2025. doi: 10.1093/database/baae121.

References

1
A graph-convolutional neural network model for the prediction of chemical reactivity.
Chem Sci. 2018 Nov 26;10(2):370-377. doi: 10.1039/c8sc04228d. eCollection 2019 Jan 14.
2
Using Machine Learning To Predict Suitable Conditions for Organic Reactions.
ACS Cent Sci. 2018 Nov 28;4(11):1465-1476. doi: 10.1021/acscentsci.8b00357. Epub 2018 Nov 16.
3
Computer-Aided Screening of Conjugated Polymers for Organic Solar Cell: Classification by Random Forest.
J Phys Chem Lett. 2018 May 17;9(10):2639-2646. doi: 10.1021/acs.jpclett.8b00635. Epub 2018 May 7.
4
Machine Learning in Computer-Aided Synthesis Planning.
Acc Chem Res. 2018 May 15;51(5):1281-1289. doi: 10.1021/acs.accounts.8b00087. Epub 2018 May 1.
5
MoleculeNet: a benchmark for molecular machine learning.
Chem Sci. 2017 Oct 31;9(2):513-530. doi: 10.1039/c7sc02664a. eCollection 2018 Jan 14.
6
Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules.
ACS Cent Sci. 2018 Feb 28;4(2):268-276. doi: 10.1021/acscentsci.7b00572. Epub 2018 Jan 12.
7
Polymer Informatics: Opportunities and Challenges.
ACS Macro Lett. 2017 Oct;6(10):1078-1082. doi: 10.1021/acsmacrolett.7b00228. Epub 2017 Sep 15.
8
Universal Cyclic Topology in Polymer Networks.
Phys Rev Lett. 2016 May 6;116(18):188302. doi: 10.1103/PhysRevLett.116.188302. Epub 2016 May 5.
9
Machine Learning Strategy for Accelerated Design of Polymer Dielectrics.
Sci Rep. 2016 Feb 15;6:20952. doi: 10.1038/srep20952.
10
InChI, the IUPAC International Chemical Identifier.
J Cheminform. 2015 May 30;7:23. doi: 10.1186/s13321-015-0068-4. eCollection 2015.