Department of Chemistry and Biochemistry, University of Texas at El Paso, El Paso, Texas 79968, United States.
School of Convergence Engineering, Pusan National University, Yangsan 50612, South Korea.
ACS Nano. 2022 Jan 25;16(1):736-745. doi: 10.1021/acsnano.1c08271. Epub 2021 Dec 20.
DNA-wrapped single-walled carbon nanotube (SWNT) conjugates have distinct optical properties leading to their use in biosensing and imaging applications. A critical limitation in the development of DNA-SWNT sensors is the current inability to predict unique DNA sequences that confer a strong analyte-specific optical response to these sensors. Here, near-infrared (nIR) fluorescence response data sets for ∼100 DNA-SWNT conjugates, narrowed down by a selective evolution protocol starting from a pool of ∼10^10 unique DNA-SWNT candidates, are used to train machine learning (ML) models to predict DNA sequences with a strong optical response to the neurotransmitter serotonin. First, classifier models based on convolutional neural networks (CNN) are trained on sequence features to classify DNA ligands as either high response or low response to serotonin. Second, support vector machine (SVM) regression models are trained to predict relative optical response values for DNA sequences. Finally, we demonstrate with validation experiments that integrating the predictions of ensembles of the highest-quality neural network classifiers (convolutional or artificial) and SVM regression models leads to the best predictions of both high and low response sequences. With our ML approaches, we discovered five DNA-SWNT sensors with higher fluorescence intensity response to serotonin than obtained previously. Overall, the explored ML approaches, shown to predict useful DNA sequences, can be used for discovery of DNA-based sensors and nanobiotechnologies.
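The integration step described above can be illustrated with a minimal sketch. The abstract does not state how sequences are encoded or how classifier and regression outputs are combined, so everything here is an assumption: the one-hot encoding, the function names `one_hot` and `consensus_label`, and both thresholds are hypothetical and are not the authors' method. The sketch only shows one plausible way to call a sequence high or low response when an ensemble of classifiers and an ensemble of regressors agree.

```python
from statistics import mean

BASES = "ACGT"  # column order of the one-hot encoding (assumed)


def one_hot(seq):
    """Encode a DNA sequence as one 4-element row per base (A, C, G, T)."""
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def consensus_label(class_votes, regression_preds,
                    vote_fraction=0.5, response_cutoff=1.0):
    """Combine ensemble outputs into 'high', 'low', or 'uncertain'.

    class_votes: 0/1 predictions from the classifiers (1 = high response).
    regression_preds: predicted relative response values from the regressors.
    vote_fraction and response_cutoff are hypothetical thresholds.
    """
    frac_high = sum(class_votes) / len(class_votes)
    avg_response = mean(regression_preds)
    # Require both model families to agree before assigning a label.
    if frac_high > vote_fraction and avg_response > response_cutoff:
        return "high"
    if frac_high < vote_fraction and avg_response < response_cutoff:
        return "low"
    return "uncertain"
```

In practice, `class_votes` would come from the trained CNN or artificial neural network classifiers and `regression_preds` from the SVM regression models; sequences labeled "uncertain" are those where the two model families disagree.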