Micron School of Materials Science and Engineering, Boise State University, Boise, ID 83725, USA.
Materials Science and Technology Division, U.S. Naval Research Laboratory, Washington, DC 20375, USA.
Molecules. 2022 May 27;27(11):3456. doi: 10.3390/molecules27113456.
Dye aggregates are of interest for excitonic applications, including biomedical imaging, organic photovoltaics, and quantum information systems. Dyes with large transition dipole moments (μ) are needed to optimize coupling within dye aggregates. Because μ can be determined from the extinction coefficient (ε), dyes with a large ε (>150,000 M⁻¹ cm⁻¹) should be engineered or identified. However, the dye properties that lead to a large ε are not fully understood, and low-throughput screening methods, such as experimental measurements or density functional theory (DFT) calculations, can be time-consuming. To screen large datasets of molecules for desirable properties (i.e., large ε and μ), a computational workflow was established using machine learning (ML), DFT, time-dependent (TD-) DFT, and molecular dynamics (MD). ML models were trained and validated on a dataset of 8802 dyes using structural features. A Classifier was developed with an accuracy of 97%, and a Regressor was constructed with an R² above 0.9 between experimental and ML-predicted values. Using the Regressor, the ε values of over 18,000 dyes were predicted. The top 100 dyes were further screened using DFT and TD-DFT to identify 15 dyes with a large μ relative to a reference dye, the pentamethine indocyanine dye Cy5. Two benchmark MD simulations were performed on Cy5 and Cy5.5 dimers, and MD was found to accurately capture experimental results. These results show that the computational workflow is effective for identifying dyes with a large μ and can be used as a tool to develop new dyes for excitonic applications.
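The link between ε and μ mentioned in the abstract can be illustrated with a short sketch. It assumes the commonly used integrated-absorption relation μ² ≈ 9.186 × 10⁻³ ∫ ε(ν̃)/ν̃ dν̃ (μ in debye, ε in M⁻¹ cm⁻¹, ν̃ in cm⁻¹), with no refractive-index or local-field correction, and it models the absorption band as a single Gaussian. The abstract does not state which relation or band shape the authors used, so the function names, the Gaussian model, and the band parameters below are all illustrative assumptions, not the paper's method.

```python
import math

# Assumed prefactor of the integrated-absorption relation (units: D^2 per M^-1 cm^-1).
# Refractive-index and local-field corrections are deliberately omitted here.
DIPOLE_STRENGTH_PREFACTOR = 9.186e-3


def gaussian_band(eps_max, nu0, fwhm):
    """Return eps(nu) for a hypothetical single Gaussian absorption band.

    eps_max: peak extinction coefficient in M^-1 cm^-1
    nu0:     band centre in cm^-1
    fwhm:    full width at half maximum in cm^-1
    """
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

    def eps(nu):
        return eps_max * math.exp(-((nu - nu0) ** 2) / (2.0 * sigma ** 2))

    return eps


def transition_dipole_debye(eps, nu_min, nu_max, steps=20000):
    """Estimate mu (debye) by trapezoidal integration of eps(nu)/nu over the band."""
    h = (nu_max - nu_min) / steps
    total = 0.5 * (eps(nu_min) / nu_min + eps(nu_max) / nu_max)
    for i in range(1, steps):
        nu = nu_min + i * h
        total += eps(nu) / nu
    integral = total * h
    return math.sqrt(DIPOLE_STRENGTH_PREFACTOR * integral)


if __name__ == "__main__":
    # Placeholder band parameters chosen for illustration only (not measured data).
    band = gaussian_band(eps_max=250_000.0, nu0=15_400.0, fwhm=800.0)
    mu = transition_dipole_debye(band, nu_min=12_000.0, nu_max=19_000.0)
    print(f"estimated mu = {mu:.2f} D")
```

Under this relation μ scales with the square root of the integrated band, so for a fixed band shape and position, quadrupling the peak ε doubles the estimated μ. This is one way to see why a large ε serves as a screening proxy for a large μ.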