Khan Aminul Islam, Kim Min Jun, Dutta Prashanta
School of Mechanical and Materials Engineering, Washington State University, Pullman, WA, 99164, USA.
Department of Mechanical Engineering, Southern Methodist University, Dallas, TX, 75275, USA.
J Signal Process Syst. 2022 Dec;94(12):1515-1529. doi: 10.1007/s11265-022-01758-3. Epub 2022 Apr 12.
Accurate and precise identification of adeno-associated virus (AAV) vectors plays an important role in dose-dependent gene therapy. Although solid-state nanopore techniques can potentially be used to characterize AAV vectors by capturing ionic current, existing data analysis techniques fall short of identifying the vectors from their ionic current profiles. Recently introduced machine learning methods such as deep convolutional neural networks (CNNs), developed for image identification tasks, can be applied to such classification. However, with the small data set available for the problem at hand, it is not possible to train a deep neural network from scratch for accurate classification of AAV vectors. To circumvent this, we applied a pre-trained deep CNN (GoogleNet) model to capture the basic features from ionic current signals and subsequently used fine-tuning-based transfer learning to classify AAV vectors. The proposed method is generic, as it requires minimal preprocessing and no handcrafted features. Our results indicate that fine-tuning-based transfer learning can achieve an average classification accuracy between 90 and 99% over three realizations with a very small standard deviation. The results also indicate that the classification accuracy depends on the electric field applied across the nanopore and on the time frame used for data segmentation. We also found that fine-tuning the deep network outperforms feature extraction-based classification for the resistive pulse dataset. To extend the usefulness of fine-tuning-based transfer learning, we tested two other pre-trained deep networks (ResNet50 and InceptionV3) for the classification of AAVs. Overall, fine-tuning-based transfer learning from pre-trained deep networks is very effective for classification, though deep networks such as ResNet50 and InceptionV3 take significantly longer to train than GoogleNet.
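The abstract mentions two steps that are easy to make concrete: cutting the ionic current trace into fixed time frames, and reporting mean accuracy with its standard deviation over three realizations. The following is a minimal sketch of both, not the authors' pipeline. The function names `segment_signal` and `summarize_accuracy`, the non-overlapping framing, the sampling rate, and the accuracy values are all assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def segment_signal(current: Sequence[float], sampling_rate_hz: float,
                   frame_s: float) -> List[List[float]]:
    """Split a current trace into non-overlapping frames of frame_s seconds.

    Assumption: frames do not overlap and an incomplete tail is discarded.
    """
    samples_per_frame = int(round(sampling_rate_hz * frame_s))
    if samples_per_frame < 1:
        raise ValueError("frame_s is shorter than one sample")
    n_frames = len(current) // samples_per_frame
    return [list(current[i * samples_per_frame:(i + 1) * samples_per_frame])
            for i in range(n_frames)]


def summarize_accuracy(accuracies: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) over independent realizations."""
    return mean(accuracies), stdev(accuracies)


# Hypothetical trace: 10 samples recorded at 10 Hz, cut into 0.3 s frames.
trace = [float(i) for i in range(10)]
frames = segment_signal(trace, sampling_rate_hz=10.0, frame_s=0.3)

# Placeholder accuracies for three realizations (NOT results from the paper).
avg, sd = summarize_accuracy([0.95, 0.96, 0.97])
```

Each frame would then be turned into a network input and passed to the pre-trained model. Changing `frame_s` changes how many frames are produced and how much of the signal each one contains, which is one plausible reason the reported accuracy depends on the chosen time frame.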