ANU College of Engineering, Computing and Cybernetics, The Australian National University, Canberra, ACT, 2600, Australia.
Data61, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Canberra, ACT, 2601, Australia.
Sci Rep. 2023 Mar 6;13(1):3742. doi: 10.1038/s41598-023-29395-1.
Optoelectric biosensors measure the conformational changes of biomolecules and their molecular interactions, which makes them useful in a range of biomedical diagnostic and analysis activities. Among these, surface plasmon resonance (SPR)-based biosensors rely on label-free, gold-based plasmonic principles and offer high precision and accuracy, making them one of the preferred methods. Datasets generated by these biosensors are being used in different machine learning (ML) models for disease diagnosis and prognosis, but there is a scarcity of models for developing or assessing the accuracy of SPR-based biosensors and for ensuring a reliable dataset for downstream model development. The current study proposes innovative ML-based DNA detection and classification models that use the reflective light angles on different gold surfaces of biosensors and their associated properties. We conducted several statistical analyses and applied different visualization techniques to evaluate the SPR-based dataset, and applied t-SNE feature extraction and min-max normalization to differentiate classifiers of low variance. We experimented with several ML classifiers, namely support vector machine (SVM), decision tree (DT), multi-layer perceptron (MLP), k-nearest neighbors (KNN), logistic regression (LR), and random forest (RF), and evaluated them using several evaluation metrics. The best accuracy was 0.94, achieved by RF, DT, and KNN for DNA classification, and 0.96, achieved by RF and KNN for DNA detection. Considering area under the receiver operating characteristic curve (AUC) (0.97), precision (0.96), and F1-score (0.97), we found that RF performed best for both tasks. Our research shows the potential of ML models in biosensor development, which can be expanded to develop novel disease diagnosis and prognosis tools in the future.
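Two of the steps named in the abstract, min-max normalization and metric-based evaluation, can be illustrated with a minimal standard-library sketch. This is not the study's code: the function names `min_max_normalize` and `binary_metrics` are hypothetical, the numbers are made-up toy values rather than SPR measurements, and the t-SNE step and the six classifiers are not reproduced here.

```python
from typing import Dict, List, Sequence


def min_max_normalize(column: Sequence[float]) -> List[float]:
    """Rescale one feature column to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for a two-class task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy values (assumed for illustration only, not taken from the dataset).
angles = [64.0, 66.0, 68.0, 72.0]
scaled = min_max_normalize(angles)

y_true = [1, 1, 0, 0, 1]  # 1 = DNA present, 0 = absent (hypothetical labels)
y_pred = [1, 0, 0, 0, 1]
scores = binary_metrics(y_true, y_pred)
```

In practice, a library such as scikit-learn provides equivalents of both steps; the hand-written versions above are only meant to make the arithmetic behind the reported metrics explicit.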