One class classification as a practical approach for accelerating π-π co-crystal discovery.

Author information

Vriza Aikaterini, Canaj Angelos B, Vismara Rebecca, Kershaw Cook Laurence J, Manning Troy D, Gaultois Michael W, Wood Peter A, Kurlin Vitaliy, Berry Neil, Dyer Matthew S, Rosseinsky Matthew J

Affiliations

Department of Chemistry and Materials Innovation Factory, University of Liverpool, 51 Oxford Street, Liverpool L7 3NY, UK.

Leverhulme Research Centre for Functional Materials Design, University of Liverpool, Oxford Street, Liverpool L7 3NY, UK.

Publication information

Chem Sci. 2020 Dec 8;12(5):1702-1719. doi: 10.1039/d0sc04263c.

DOI:10.1039/d0sc04263c

PMID:34163930

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8179233/

Abstract

The implementation of machine learning models has brought major changes in the decision-making process for materials design. One matter of concern for the data-driven approaches is the lack of negative data from unsuccessful synthetic attempts, which might generate inherently imbalanced datasets. We propose the application of the one-class classification methodology as an effective tool for tackling these limitations on the materials design problems. This is a concept of learning based only on a well-defined class without counter examples. An extensive study on the different one-class classification algorithms is performed until the most appropriate workflow is identified for guiding the discovery of emerging materials belonging to a relatively small class, that being the weakly bound polyaromatic hydrocarbon co-crystals. The two-step approach presented in this study first trains the model using all the known molecular combinations that form this class of co-crystals extracted from the Cambridge Structural Database (1722 molecular combinations), followed by scoring possible yet unknown pairs from the ZINC15 database (21 736 possible molecular combinations). Focusing on the highest-ranking pairs predicted to have higher probability of forming co-crystals, materials discovery can be accelerated by reducing the vast molecular space and directing the synthetic efforts of chemists. Further on, using interpretability techniques a more detailed understanding of the molecular properties causing co-crystallization is sought after. The applicability of the current methodology is demonstrated with the discovery of two novel co-crystals, namely pyrene-6H-benzo[c]chromen-6-one (1) and pyrene-9,10-dicyanoanthracene (2).
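
The two-step idea described above (fit on known positives only, then score and rank unseen pairs) can be illustrated with a minimal sketch. This is not the authors' workflow: a k-nearest-neighbour distance is used here as a simple stand-in one-class scorer, and the two-dimensional descriptor vectors, the pair names, and the helper functions `knn_score` and `rank_candidates` are all hypothetical.

```python
import math

def knn_score(x, train, k=3):
    """Negated mean Euclidean distance from x to its k nearest training
    points, so a higher score means 'more like the known class'."""
    dists = sorted(math.dist(x, t) for t in train)
    return -sum(dists[:k]) / k

def rank_candidates(train, candidates, k=3):
    """Score every candidate against the positive-only training set and
    return (name, score) tuples, best first."""
    scored = [(name, knn_score(vec, train, k)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Step 1: "training" data are known positives only (hypothetical descriptors).
known_pairs = [
    (0.0, 0.0), (0.2, 0.1), (-0.1, 0.3), (0.3, -0.2),
    (-0.3, -0.1), (0.1, -0.3), (-0.2, 0.2), (0.4, 0.3),
]

# Step 2: score unlabeled candidate pairs (hypothetical descriptors).
candidates = {
    "pair_A": (0.1, 0.0),   # inside the known cluster
    "pair_B": (4.0, 4.0),   # far from anything known
    "pair_C": (1.0, 1.0),   # on the edge of the known cluster
}

ranking = rank_candidates(known_pairs, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

No negative examples are needed at any point: candidates are ordered only by how closely they resemble the known class, and the top of the list is what would be passed on for experimental screening.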
